FIELDS OF STUDY AND DEVELOPMENT OF MOTIVATION
TO SEEK ADVANCED TRAINING¹


DONALD L. THISTLETHWAITE 
Vanderbilt University 


Follow-up of 1,086 academically talented college students, initially 
planning to enroll in one of 15 major fields of study, supports the 
hypothesis that faculty pressures and activities influence the student's 
desire to seek advanced training. Men who report that their teachers 
exert relatively strong press for independence, supportiveness, and 
affiliation—or who are exposed to Honors programs or to peer groups 
characterized by openness to faculty influence—tend to raise their 
aspirations for advanced training more than men not reporting such 
press. Plausible rival interpretations in terms of precollege charac- 
teristics were ruled out by covariance analysis. Differences between 
fields of study are greatest with respect to faculty press for humanism,
scientism, vocationalism, and student social welfare orientations.


It has been shown that colleges differ 
in the percentages of their graduates 
who later obtain the PhD degree 
(Knapp & Goodrich, 1952) or who later 
give evidence of becoming promising 
scholars (Knapp & Greenbaum, 1953). 
Aptitude differences between incoming 
freshman classes of these colleges do not 
account for all the observed variations. 
It has been found that the faculties at 
institutions whose graduates attain the 
PhD in greater numbers than would be 
expected on the basis of the aptitudes 
of their entering classes are perceived 
differently than the faculties at less 
productive institutions (Thistlethwaite, 
1959a, 1959b). Also, some students
making the decision to seek graduate
or professional training during the college
years attribute their decision to the
influence of college teachers (Gropper
& Fitzpatrick, 1959; Stecklein & Eckert,
1961; Thistlethwaite, 1960). One interpretation
of these results is that some
colleges are more successful than others
in creating learning environments which
motivate students to seek advanced
training, and that these differences are,
in part, traceable to faculty behaviors
and activities.

¹ Data for this study were collected while
the author was at the National Merit Scholarship
Corporation, and this research was
partially supported by the National Science
Foundation, the Old Dominion Foundation,
and by Ford Foundation grants to the National
Merit Scholarship Corporation.

An alternative interpretation of the 
observed differences in college produc- 
tivity is that they merely reflect differ- 
ences between colleges predictable on 
the basis of the sex, aspirational, or other 
personality characteristics of the institu- 
tion’s entering classes (Astin, 1961; 
Holland, 1957; Stuit, Helmstadter, & 
Fredericksen, 1956). Although such 
interpretative ambiguities are inherent 
in correlational studies, it is possible to 
progressively narrow the number of 
plausible rival hypotheses for explaining
the alleged effects of various learning
environments.

The present study incorporates four 
student background variables which 
previous research indicates are related 
to the institution’s output, and permits 
us to evaluate their adequacy in ac- 
counting for the observed associations 
between college press and the degree 
aspirations of college students. 

This paper has two purposes. First,
to identify variables which differentiate
between major fields of study, it provides
comparisons of different undergraduate
fields with respect to faculty
and student press and student values.
Second, to submit selected hypotheses
to a more rigorous attempt at falsification,
it presents a covariance analysis
of the effects of faculty and student
press upon student plans to seek advanced
training.


METHOD 
Sample 


This study is based upon a follow-up, 
at the end of the sophomore year of 
college, of exceptionally talented stu- 
dents who had received scholarships or 
honorary recognition awards in the 
third annual National Merit Scholar- 
ship competition. The designated sam- 
ple included random samples of all 
finalists (Merit Scholars or Certificate 
of Merit winners) who had indicated 
on a previous survey at the end of the 
first semester of the freshman year 
(Thistlethwaite, 1961b) that their prob- 
able major field of study was one of the 
fields of specialization listed in Table 1. 
Returns were received from 1,086 stu- 
dents, 67% of the designated sample. 
The response rate from Merit Scholars 
(92%) was greater than that from Cer- 
tificate of Merit winners (63%), and 
Merit Scholars are more heavily repre- 
sented among the respondents than in 
the original population of finalists in the
1958 Merit program. In 1958, about
13% of the finalists received Merit 
scholarships, while Merit Scholars con- 
stituted 18% of the sample analyzed 
here. The sample included students 
enrolled at 335 different colleges or 
universities, approximately two-thirds of 
which are privately controlled insti- 
tutions. 


Plans to Seek Advanced Training 


Each student was asked to indicate 
the highest level of education he 
planned to complete—both as he re- 
called his plans when he first entered 
college and as they were at the time of 
responding. In the following analysis 
reported plans at the beginning of col- 
lege are designated as “pretest,” and 
plans after 2 years of college as “posttest,”
measures of aspiration.² Numerical
scores of 1-5 were assigned to the
aspiration levels, as shown in Table 2. 
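
The scoring of aspiration levels and the classification of each student as having raised, lowered, or not changed his plans can be sketched as follows. This is a minimal illustration only: the level labels, the function names, and the assumption that scores 1-5 follow the row order of Table 2 are ours, not the author's.

```python
# Hypothetical coding of the five aspiration levels. The 1-5 ordering is
# assumed to follow the row order of Table 2.
ASPIRATION_SCORES = {
    "some college": 1,
    "bachelor's degree": 2,
    "master's degree": 3,
    "professional degree": 4,
    "phd or equivalent": 5,
}


def classify_change(pretest, posttest):
    """Return 'raised', 'lowered', or 'unchanged' for one student."""
    diff = ASPIRATION_SCORES[posttest] - ASPIRATION_SCORES[pretest]
    if diff > 0:
        return "raised"
    if diff < 0:
        return "lowered"
    return "unchanged"


def change_proportions(pairs):
    """Proportion of students in each change category.

    pairs: list of (pretest_level, posttest_level) tuples.
    """
    counts = {"raised": 0, "lowered": 0, "unchanged": 0}
    for pre, post in pairs:
        counts[classify_change(pre, post)] += 1
    n = len(pairs)
    return {label: count / n for label, count in counts.items()}
```

Applied to the full sample, a tabulation of this kind would yield the percentages of students who raised, lowered, or retained their aspiration levels.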


Field Press and Value Scales 


Each student was asked to describe 
the pressures and activities, intellectual 
and social, which characterized the fac- 
ulty and students in his major field of 
study, by responding to a 200-item In- 
ventory of College Characteristics. The 
inventory included 10 scales descriptive 
of the faculty in the subject’s major field 
(Achievement, Affiliation, Compliance, 
Directiveness, Enthusiasm, Humanism, 
Independence, Scientism, Supportive- 
ness, Vocationalism) and 10 scales de- 
scriptive of student associates in the 
subject’s classroom and living groups 
(Achievement, Aestheticism, Aggression,
Breadth of Interests, Competition,
Openness to Faculty Influence, Participation,
Reflectiveness, Social Conformity,
Scientism). Each scale contained
10 items, five keyed so that a True
response was weighted positively and
five keyed so that a False response was
weighted positively. Eighteen of the
scales are similar to those described in
an earlier study (Thistlethwaite, 1960),
except that approximately 20% of the
items were revised to improve scale
reliabilities. The average scale reliabilities
(estimated by z transformations of
r’s based on the Kuder-Richardson
Formula 20) were slightly improved as a
result of these revisions: from .65 to .70
for the nine revised faculty scales, and
from .69 to .76 for the nine revised student
scales. The two new scales added
were those for assessing faculty press
for Scientism and student Openness to
Faculty Influence. Representative items
(and the response weighted positively)
from the two new scales follow: for
faculty Scientism, “The department
often urges the student to take his elective
courses in science or mathematics”
(T); for student Openness to Faculty
Influence, “In most courses students
find ways to prevent instructors from
dominating them” (F). The Kuder-Richardson
reliability estimates for selected
scales are summarized in Table 1.

² It is difficult to say what bias, if any, is
introduced by the use of retrospective reports
for the pretest measure of aspirations. Possibly
genuine pretest measures would not correlate
perfectly with retrospective pretests,
but there is no a priori basis for expecting
the latter to systematically bias the comparisons.
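
The scale scoring and reliability estimates described above can be sketched as follows. The sketch assumes that each keyed response contributes one point to the scale score (the text says only that such responses were "weighted positively"), and the function names are hypothetical. The reliability function is the standard Kuder-Richardson Formula 20, and the averaging function applies Fisher's z transformation to the individual reliabilities before taking the mean.

```python
from math import atanh, tanh


def score_scale(responses, keyed_true):
    """Count the responses that match the keyed direction.

    responses:  list of bools, the student's True/False answers.
    keyed_true: list of bools, True where a True answer is weighted positively.
    Assumes one point per keyed response.
    """
    return sum(1 for r, k in zip(responses, keyed_true) if r == k)


def kr20(item_matrix):
    """Kuder-Richardson Formula 20 for dichotomous (0/1) item scores.

    item_matrix: one row per respondent, one 0/1 entry per item.
    """
    n = len(item_matrix)
    k = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n
    # Population variance of total scores, matching the p*q item variances.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


def mean_reliability(reliabilities):
    """Average a set of reliabilities through Fisher's z transformation."""
    zs = [atanh(r) for r in reliabilities]
    return tanh(sum(zs) / len(zs))
```

Averaging in the z metric rather than averaging the coefficients directly gives somewhat more weight to the higher reliabilities.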

Students were also asked to indicate 
what they considered important require- 
ments for a satisfying job or career, by 
checking one of three responses (“highly 
important,” “important,” or “unimpor- 
tant”) for each of 12 described require- 
ments. The 12 items describing job 
requirements, segregated into three 
value scales on the basis of intercorrela- 
tions between responses, were as follows: 

1. Intellectual orientation: “give me
an opportunity to live and work in the
world of ideas,” “permit me to be creative
and original,” “provide an opportunity
to work on theoretical problems
regardless of practical value,” “provide
me with adventure,” “provide an opportunity
to use my special abilities or
aptitudes.”

2. Social Status orientation: “give
me social status and prestige,” “provide
me with a chance to earn a good deal
of money,” “give me a chance to exercise
leadership,” “enable me to look
forward to a stable, secure future,”
“provide an opportunity to work on the
application of knowledge to practical
affairs.”

3. Social Welfare orientation: “give
me an opportunity to be helpful to
others,” “give me opportunities to work
with people rather than things.”


Scores of 2, 1, or 0 were assigned to item 
responses, and total scale scores were 
computed for students on each of the 
three value scales.³
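
A value scale score of this kind is simply the sum of the item scores. The following sketch assumes that 2, 1, and 0 correspond to “highly important,” “important,” and “unimportant” respectively; the mapping and the function name are illustrative rather than taken from the scoring keys.

```python
# Assumed mapping of the three response options to item scores.
RESPONSE_SCORES = {"highly important": 2, "important": 1, "unimportant": 0}


def value_scale_score(responses):
    """Total score on one value scale.

    responses: list of the student's answers to the items of that scale,
    each one of the keys of RESPONSE_SCORES.
    """
    return sum(RESPONSE_SCORES[r] for r in responses)
```

Under this scoring the two-item Social Welfare scale ranges from 0 to 4, while the five-item Intellectual and Social Status scales range from 0 to 10.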


RESULTS 
Characteristics of Fields of Study 


To study the power of the various
scales to reveal differences among teachers
or among students in different fields,
an analysis of variance was made for
scores on each scale. In the analyses for
press scales, student reports were classified
by the major field each student
indicated he was describing on the Inventory
of College Characteristics (some
students changed major fields and the
fields they described were not necessarily
the same as the ones they planned
to enter during the freshman year of
college). Data on student values described
the subject’s personal dispositions
after 2 years of college; therefore,
subjects were classified in this analysis
by the major field of study they reported
for the coming school year.

³ Complete lists of items in each press and
value scale, keys used in scoring responses,
and more complete information on the reliability
and discriminatory power of each
scale have been deposited with the American
Documentation Institute. Order Document
No. 7074 from ADI Auxiliary Publications
Project, Photoduplication Service, Library of
Congress; Washington 25, D. C., remitting in
advance $1.25 for microfilm or $1.25 for
photocopies. Make checks payable to: Chief,
Photoduplication Service, Library of Congress.
The faculty scale, Vocationalism, was
formerly called Pragmatism, while the student
scale, Aestheticism, was formerly called
Humanism (cf. Thistlethwaite, 1960). Approximately
half of the items used in the
press scales were patterned after items in the
College Characteristics Index (Pace & Stern,
1958; Stern, 1958). Ten of the job requirement
items were identical with those used in
the Cornell study (Goldsen, Rosenberg, Williams,
& Suchman, 1960).

Table 1 summarizes for each of 12
scales the F ratio and an index of discriminatory
power: the proportion of
variance of scores on the scale associated
with the major field classification. This
index was obtained by dividing the error
variance (within field variance) by the
total variance and subtracting this ratio
from one. The tabled index is equivalent
to epsilon squared (Nunnally, 1960;
Peters & Van Voorhis, 1940).⁴
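
The index of discriminatory power can be illustrated with a short computation. The sketch below assumes that both "variances" are mean squares (sums of squares divided by their degrees of freedom), which is what the equivalence to epsilon squared implies; the function names are hypothetical. The second function expresses the same quantity through the F ratio of a one-way analysis of variance, which follows algebraically from the definition.

```python
def discriminatory_index(groups):
    """1 - (within-group variance / total variance), using mean squares.

    groups: list of lists of scale scores, one inner list per major field.
    """
    all_scores = [x for g in groups for x in g]
    n = len(all_scores)
    k = len(groups)
    grand_mean = sum(all_scores) / n
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_within += sum((x - group_mean) ** 2 for x in g)
    ms_within = ss_within / (n - k)   # error (within field) variance
    ms_total = ss_total / (n - 1)     # total variance
    return 1 - ms_within / ms_total


def index_from_f(f_ratio, df_between, df_within):
    """The same index written in terms of F and its degrees of freedom."""
    return df_between * (f_ratio - 1) / (df_between * f_ratio + df_within)
```

The second form is convenient when only the F ratio and its degrees of freedom are reported; it shows that the index is zero when F equals one and approaches one as F grows large.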
Although every one of the 20 press
scales, except that for faculty Supportiveness,
revealed significant differences
between fields at the .01 level, some
scales had greater discriminatory power
than others. Classification by major field
accounted for more than 20% of the
total variance of scores on the faculty
Humanism, Scientism, and Vocationalism
scales, while it accounted for less
than 5% of the variance on four of the
other faculty press scales and on seven
of the student press scales. Accordingly,
differences between fields are described
in Table 1 only for the 12 scales having
the greatest discriminatory power. In
Table 1 both field and press scales are
clustered by similarity of deviations from
the average press or value score.

⁴ When the basis of classification is a continuous
variable, as in subsequent tables, this
index of discriminatory power may be converted
to the equivalent product-moment correlation
by simply obtaining the square root
of the index.


TABLE 1

FACULTY AND STUDENT PRESS AND STUDENT VALUES IN 15 FIELDS OF STUDY

[Table body not recoverable from the source. Rows: major field of study
(engineering, physics, chemistry, mathematics, business, biology,
economics, psychology, education, language, political science, sociology,
English, philosophy, history), followed by scale reliability, F ratio, and
index of discriminatory power. Columns: N;ᵇ deviation scoresᵃ on the
humanistic scales (faculty Humanism, Independence, and Enthusiasm;
student Aestheticism and Reflectiveness), on the remaining press scales
(student Scientism; faculty Scientism, Compliance, and Vocationalism),
and on the three value scales (Social Welfare, Social Status,
Intellectual).]

ᵃ Deviation scores are expressed as a ratio between the deviation of the
field mean from the grand mean and the square root of the within group
variance. Positive deviations represent greater amounts of the press or
value. To calculate the t ratio for estimating the significance of the
difference between means for any two fields, the algebraic difference
between the tabled deviation scores may be divided by [expression
illegible in source].
ᵇ Because of incomplete data Ns for means on the value scales deviate
slightly (never more than three) from the average Ns shown in this column.

The four physical science fields listed 
at the top of Table 1 (engineering, 
physics, chemistry, and mathematics) 
represent fields in which there tends to 
be relatively strong faculty press for 
scientism, compliance, and vocational- 
ism, but relatively little faculty enthusi- 
asm and weak faculty press for hu- 
manism and independence. Similarly, 
students in these fields tend to exhibit 
strong press for scientism but relatively 
little press for estheticism or reflective- 
ness. Students in the physical sciences 
tend to have weak social welfare orien- 
tations; only those in engineering have 
exceptionally strong social status orien- 
tations and only those in physics and 
mathematics seem exceptionally strong 
in intellectual orientations. 

At the bottom of Table 1, we find a 
cluster of humanities and social science 
fields (including history, philosophy, 
English, sociology, political science, and 
language) which are characterized by 
relatively strong faculty press for hu- 
manism, independence, and enthusiasm, 
but only moderate or weak faculty press
for scientism, compliance, and vocationalism.
Students in these fields tend to
exert strong press for estheticism and
reflectiveness, weak press for scientism,
and tend to have relatively strong welfare
orientations.

The field of business resembles the 
physical science fields with respect to its 
low scores on the humanistic scales and 
its relatively strong faculty press for 
compliance and vocationalism. How- 
ever, neither teachers nor students in 
business place any appreciable stress 
upon scientism. Students majoring in
business exhibited the strongest social
status, and the weakest intellectual,
orientations.

Biology, economics, psychology, and 
education are characterized by a “mid- 
dle” position on most of the press scales. 
While biology tends to have teachers 
and students who exercise strong press 
for scientism, it has only moderate devi- 
ations on the humanistic scales. Eco- 
nomics is deviant only with respect to 
its strong student press for reflectiveness. 
Education is highly elevated in faculty 
press for compliance, and moderately 
elevated in faculty press for vocational- 
ism. Psychology is neither strongly hu- 
manistic nor scientific in its press. On 
the value scales, both economics and 
education majors tend to have weak 
intellectual orientations. Economics ma- 
jors (like business majors) tend to have 
strong social status orientations, while 
psychology and, particularly, education 
majors tend to have strong social wel- 
fare orientations. The Social Welfare 
scale is the only one of the three value 
scales which discriminates between fields 
in a manner similar to the press scales, 
and is most highly related to the humanistic
scales.


Aspirations to Seek Advanced Training 


Table 2 shows that during the first 2
years of college there was an increase in
the proportion of these students planning
to obtain advanced graduate degrees,
while there was a decrease in the
proportion planning to obtain only the
baccalaureate degree. During this period
33% raised, while 9% lowered, their
aspiration levels. The majority (58%)
did not change with respect to the highest
level of education they planned to
complete.


TABLE 2

CHANGES IN HIGHEST LEVEL OF EDUCATION STUDENTS PLANNED TO COMPLETE

                              At beginning       After 2 years
Highest level student          of college          of college      Percentage
planned to complete            Nᵃ       %          N        %      difference
Some college                   12      1.1          6       .6        -.5
Bachelor’s degree             393     36.7        198     18.5      -18.2
Master’s degree               238     22.2        352     32.8       10.6
Professional degree
  (MD, DDS, LLB, etc.)        155     14.5        160     14.9         .4
PhD or equivalent             274     25.6        356     33.2        7.6
Totals                      1,072               1,072

ᵃ Does not include 14 students who failed to indicate the highest level of
education at both stages.


In order to determine what control
variables are most relevant for this
sample, posttest aspiration levels were
compared by analysis of variance for
groups with different personal attributes.⁵
Student value orientations were
reported after 2 years of college, and
consequently may reflect the effects of
college experiences. Nonetheless they
represent the best approximations available
concerning the input characteristics
of this sample. The results in Table 3
indicate that initial aspiration level is
the best predictor of posttest aspirations,
while sex is the next best predictor. The
only other personal attribute having an
appreciable correlation with posttest aspirations
was initial major field of study.

⁵ In this, and other subsequent analyses
involving classification by scale or test
scores, extreme groups were determined according
to whether the subject’s score fell
above or below interval limits most closely
approximating the median score of the distribution
of scores for all subjects.
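
The median-based classification into high and low groups can be sketched as follows. The paper's procedure uses interval limits closest to the median, so the exact cut point and the treatment of scores falling at the median are not specified; in this illustration, which uses a hypothetical function name, scores at or below the median are assigned to the low group.

```python
from statistics import median


def median_split(scores):
    """Label each score 'high' or 'low' relative to the median of all scores.

    Scores equal to the median are placed in the 'low' group; this tie
    rule is an assumption, not the paper's stated procedure.
    """
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]
```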


Personal Attributes Associated with 
Changes in Aspirations 


Some of the variables listed in Table 
3 are undoubtedly correlated; hence if 
we control with respect to the variable 
accounting for the largest proportion 
of variance (pretest aspiration level) we 
will at the same time be controlling with 
respect to the effects of some of the 
other antecedent variables. Accordingly, 
analysis of covariance in which posttest 
differences between groups were ad- 
justed for pretest differences in aspira- 
tions was used to compare the terminal 
aspirations of various groups. The re- 
sults in Table 4 show that when pretest 
aspirations are controlled, three of the 
remaining six control variables (apti- 
tude, social status, and social welfare 
orientations) fail to show any relation- 
ship to aspirations. Three personal attri- 
butes (sex, initial major field of study, 
and intellectual orientation) remained 
significantly associated with adjusted 
posttest aspiration levels, and of these 
sex is clearly the most highly related. 
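
The adjustment of posttest differences for pretest differences can be illustrated with a standard one-covariate analysis of covariance. The sketch below is a textbook version, not necessarily the author's exact computation: it estimates the pooled within-group regression of posttest on pretest scores and uses that slope to adjust each group's posttest mean to the grand pretest mean. The function name and data layout are hypothetical.

```python
def adjusted_means(groups):
    """Posttest means adjusted for pretest differences between groups.

    groups: dict mapping a group name to a list of (pretest, posttest)
    score pairs. Uses the pooled within-group regression slope.
    """
    all_pre = [x for pairs in groups.values() for x, _ in pairs]
    grand_pre = sum(all_pre) / len(all_pre)

    sxx = 0.0  # pooled within-group sum of squares for the pretest
    sxy = 0.0  # pooled within-group sum of cross products
    means = {}
    for name, pairs in groups.items():
        mean_pre = sum(x for x, _ in pairs) / len(pairs)
        mean_post = sum(y for _, y in pairs) / len(pairs)
        means[name] = (mean_pre, mean_post)
        sxx += sum((x - mean_pre) ** 2 for x, _ in pairs)
        sxy += sum((x - mean_pre) * (y - mean_post) for x, y in pairs)

    slope = sxy / sxx
    return {
        name: mean_post - slope * (mean_pre - grand_pre)
        for name, (mean_pre, mean_post) in means.items()
    }
```

A group that began with above-average aspirations has its posttest mean adjusted downward, and a group that began below average has its mean adjusted upward, so that the remaining difference reflects change rather than initial standing.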

The five initial major field groups
with the highest adjusted mean levels
of aspiration were philosophy, physics,
economics, chemistry, and mathematics,
while the five with the lowest adjusted
means were education, language, political
science, business, and biology. Although
a slight tendency may be observed
for students with different major
field preferences to change their aspirations
at different rates, differences between
fields were not great enough to
warrant relating changes in aspirations
to the profile differences shown in Table
1. Accordingly, in the following comparisons
students were treated as the
unit of analysis, and were successively
segregated by sex, intellectual orientation,
and initial major field of study.


TABLE 3

ANALYSIS OF VARIANCE FOR COMPARING ASPIRATION LEVELS
AMONG GROUPS WITH DIFFERENT PERSONAL ATTRIBUTES

                                     Group with lowest and
                                     highest mean level of
                                     aspiration after 2 years              Proportion of
Classification and                   of college                            variance associated
number of groups                     Low          High        F ratioᵃ     with classification
Initial aspiration level (2)         low          high        498.68**     .317
Sex (2)                              women        men         217.93**     [illegible]
Initial major field of study (15)    education    physics      10.48**     .110
Aptitude test scores (2)ᵇ            low SQT      high SQT     25.26**     .023
Intellectual orientation (2)         weak         strong       43.59**     .039
Social status orientation (2)        strong       weak          6.20*      .005
Social welfare orientation (2)       strong       weak          4.77*      [illegible]

ᵃ Degrees of freedom for the numerator were one less than the number of
groups, while df for the denominator averaged 1,059.
ᵇ Scores on the Scholarship Qualifying Test (V + M) administered in the
eleventh grade of secondary school.
*p < .05.
**p < .01.


TABLE 4

COVARIANCE ANALYSIS FOR COMPARING CHANGES IN ASPIRATION
AMONG GROUPS WITH DIFFERENT PERSONAL ATTRIBUTES

                                     Group with lowest and
                                     highest adjusted mean                 Proportion of
                                     level of aspiration after             residual variance
Classification and                   2 years of college                    associated with
number of groups                     Low          High        F ratioᵃ     classification
Sex (2)                              women        men          48.40**     .042
Initial major field of study (15)    education    philosophy    2.26**     [illegible]
Aptitude test scores (2)             ns                         2.59
Intellectual orientation (2)         weak         strong       14.95**     .013
Social status orientation (2)        ns                         1.38
Social welfare orientation (2)       ns                          .25

ᵃ Degrees of freedom for the numerator were one less than the number of
groups, while df for the denominator averaged 1,057.
**p < .01.


College Press Associated with Plans to 
Seek Advanced Training 


Each of the 20 press scales and the
student’s report of whether he had participated
in an Honors program at his
college were used as a basis for classification
into “high” or “low” treatment
groups. Significant F ratios were obtained
for six of the 21 comparisons
for men, but the F ratios for women
were nonsignificant, except for the comparison
involving women reporting
strong vs. weak faculty press for humanism.
Weak press for humanism was
in this case associated with increased
motivation to seek advanced training
(F = 7.90, df = 1/459).

Among men, increases in motivation
to seek advanced training were associated
with faculty press for independence,
supportiveness, and affiliation, and
with the absence of faculty press for
vocationalism. In addition, participation
in Honors programs and exposure
to peer groups characterized by openness
to faculty influence were related to
increased motivation to seek advanced
training. Only the results of analyses
yielding significant F ratios for men are
summarized in Table 5.


TABLE 5

COVARIANCE ANALYSIS FOR COMPARING CHANGES IN ASPIRATION
AMONG MEN EXPOSED TO COLLEGE PRESS

                                  Group with lower and
                                  higher adjusted mean                          Proportion of
Basis for classifying             level of aspiration                           residual variance
students into treatment                                                         associated with
groups                            Low               High           F ratio      classification
Honors program participation      nonparticipants   participants    8.68**      .013
Faculty press
  Independenceᵃ                   weak              strong         15.38**      .023
  Supportiveness                  weak              strong         [illegible]  .013
  Affiliation                     weak              strong          4.53*       .006
  Vocationalism                   strong            weak            3.90*       .005
Student press
  Openness to Faculty Influence   weak              strong          5.71*       .008

ᵃ Results for comparisons based upon classification by faculty press for
independence should be interpreted with caution because of heterogeneity
of regression among treatment groups.
*p < .05.
**p < .01.


Participants in Honors programs were
also asked to describe how their courses
differed from those of students not in
Honors programs. A content analysis of
the write-in responses, reported in greater
detail elsewhere (Thistlethwaite,
1961a), indicated that Honors courses
were seen as entailing changes in course
content (presentation of more advanced
topics and problems, and an accelerated
pace), in instructional methods (greater
use of seminar and tutorial instruction),
and in enforcement of curricular restrictions
(increased opportunity for independent
study, waiving of course and
graduation prerequisites). Clearly the
content of these responses is highly consonant
with the faculty press scores associated
with changes in aspiration (see
Table 5).


Intellectual Orientations and Press for 
Independence 


Do the effects of faculty press for
independence vary according to the student’s
intellectual orientation? Perhaps
students with strong intellectual dispositions
need a considerable degree of
freedom to develop their special interests,
while those with weak intellectual
dispositions may benefit more from instructors
who are somewhat didactic
and authoritarian. Unfortunately, measures
of the student’s intellectual orientations
at the beginning of college were
not available, and subjects were of
course not randomly assigned to treatment
groups. Hence at best we can only
look for trends that would tend to support
or refute this interaction hypothesis.

Among men with strong intellectual
values (at the end of the sophomore
year) those reporting strong faculty
press for independence were compared
with those reporting weak faculty press
for independence. A similar comparison
of press groups was made for men
with weak intellectual orientations.⁶
Table 6 shows the results. It can be
seen that strong press for independence
is associated with greater motivation to
seek advanced training, both among
men with strong, and among those with
weak, intellectual orientations. Clearly
there is no evidence of the expected
interaction effect; on the contrary the
differences between the adjusted mean
aspiration levels suggest that the effects
of press for independence may be
greater among men with weak intellectual
orientations.

⁶ A 2 × 2 analysis of covariance (which
would provide a separate test of the significance
of the interaction effect) was considered
inappropriate since no valid test can be
made of the effects of initial intellectual orientation.
The obtained measures of the student’s
intellectual orientations may reflect
the effects of exposure to teachers; thus in
such an analysis measures of effects associated
with “value treatment” would be confounded
with the effects of teachers. The comparison
of press groups at each of the two levels
permits us to avoid such confounding, since
at each level we are comparing groups both
of which are deviant in the same direction,
and to about the same degree, on the value
scale.


TABLE 6

PRESS FOR INDEPENDENCE AND CHANGES IN ASPIRATION AMONG MEN
WITH STRONG AND WEAK INTELLECTUAL ORIENTATIONS

                    Adjusted mean level of aspiration                         Proportion of
Intellectual        Weak press for    Strong press for                        residual variance
orientations        independence      independence      Difference  F ratio   associated with
of students                                                                   press classification
Strong (N = 326)        4.19              4.39             .20       5.38*    .013
Weak (N = 274)          3.54              3.86             .32       9.19**   .029

*p < .05.
**p < .01.


Initial Field of Study and Press for 
Independence 


Thistlethwaite (1960) previously
found faculty press for independence to
be significantly associated with increases
in motivation to seek advanced training
among students majoring in the arts,
humanities, or social sciences, but found
no relationship among students majoring
in the physical or biological sciences.
A similar analysis, summarized in Table
7, was made for men classified by initial
major field of study.⁷ It can be seen that
the results parallel those reported in the
earlier study: faculty press for independence
is positively and significantly
associated with the development of motivation
for advanced training among
men initially planning to major in the
humanities or social sciences, but no significant
relationship was found among
men planning to major in the natural
sciences.

⁷ Since business majors were excluded in
the earlier analysis, they have also been
excluded in the analysis shown in Table 7.


DISCUSSION


Gropper and Fitzpatrick (1959), who
asked a large sample of undergraduate
seniors when they made their decision
to seek, or not to seek, graduate training,
found that 48% made their decisions
during the third or fourth year of
college, while only 27% made their
decisions during the first or second year
of college. Similar results were obtained
by the author, who studied the advanced
training decisions of National Merit
finalists (in the first National Merit
program) who had completed college.
Among these students 52% reported
they made their decisions during the
third or fourth year of college, while
only 14% said they made their decisions
during the first 2 years of college. In
limiting the present study to the first 2
years of college we are dealing with a
population in which many students
have made only minimal changes in
aspirations; hence the variance of
“change” scores will be markedly reduced
compared to those of students
who have completed college. A related
consideration is that 40% of these students
entered college planning to seek
the PhD, MD, or a comparable degree,
and none of these students can be expected
to show marked increases in
motivation for advanced training. Also
it seems likely that differences between
fields of specialization with respect to
faculty and student press are greater
during the junior and senior years than
during the freshman and sophomore
years of college. Thus we are dealing
with independent and dependent variables
in which the range of variation is
severely attenuated; under these conditions
it is of course likely that relationships
between college press and changes
in aspirations will be underestimated.
We would expect to find greater relationships
among students less highly
selected with respect to ability and
among students who have completed
college.


TABLE 7

PRESS FOR INDEPENDENCE AND CHANGES IN ASPIRATION AMONG MEN
IN DIFFERENT FIELDS OF STUDY

                        Adjusted mean level of aspiration                        Proportion of
Initial major           Weak press for    Strong press for                       residual variance
field of study          independence      independence      Difference  F ratio  associated with
                                                                                 classification
Natural sciencesᵃ
  (N = 307)                 4.05              4.19             .14       1.92    .003
Social sciences
  and humanitiesᵇ
  (N = 262)                 3.79              4.21             .42      14.05**  .048

ᵃ Natural sciences include: biology, chemistry, engineering, mathematics,
and physics.
ᵇ Social sciences and humanities include: economics, education, English,
history, language, philosophy, political science, psychology, and sociology.
**p < .01.

The most important result of the 
present study is the finding that, even 
under these experimental conditions, 
students exposed to some educational 
treatments can be shown to have raised 
their degree aspirations more than com- 
parable students exposed to other edu- 
cational treatments. Although the re- 
sults do not permit us to define in great 
detail the kinds of educational treat- 
ments that are conducive to the devel- 
opment of motivation for advanced 
training, they do suggest some of the 
features of colleges that influence men 
to seek graduate or professional train- 
ing. First, teachers who exert press for 
independence tend to be more effective 
in stimulating men in the social sciences 
or humanities. The salient features of 
such press are best defined by the items 
which differentiate most clearly between 
men, at each initial aspiration level, 
who raised or who did not change their 
aspirations. Men who were influenced 


to raise their aspirations more frequently 
reported: “there are many facilities (in 
the department) for individual creative 
activity,” “the faculty is not too busy 
to invent ways of encouraging initiative 
among students,” “there is opportunity 
for pursuing independent study under 
the supervision of faculty members,” 
“the department encourages students to 
undertake independent projects or 
theses,” “in class discussions, papers, 
and exams, the main emphasis is on 
the development of critical judgment,” 
and “a well reasoned report can rate an 
A grade here even though its viewpoint 
is opposed to the professor’s.” 

Secondly, the opportunity to participate
in Honors programs appears related
to increased motivation for advanced
training. As already indicated,
students perceive Honors courses to be 
different from non-Honors courses 
mainly with respect to the incorporation 
of teaching practices and instructional 
goals more characteristic of graduate 
study. Finally, teachers who are sup- 
portive appear to have the greatest in- 
fluence upon students’ aspirations. Such 
teachers are distinguished by the follow- 
ing attributes: willingness to discuss the 
students’ goals with them and to help 
them to discover their special talents, 
willingness to give special tutoring or 
counsel to students having difficulty, 
willingness to help students obtain re- 
dress of grievances, and care in sparing 
the students’ feelings when giving nega- 
tive evaluations. 

Student reports concerning the psy- 
chological characteristics of the environ- 
ments they have experienced during the 
first 2 years of college are probably most 
valuable as a record of the images which 
different fields of study present to the 
beginning student. It is possible that 
some fields present very different images 
to the beginning and to the advanced 
student. For example, psychology is seen 
by the beginning student as neither
strongly scientific nor humanistic in its 
press, but upperclassmen who are ma- 
jors in psychology might well perceive 
faculty and students as placing much 
greater stress upon science. The images 
presented to the beginning students, and 
the consistency of images at different 
levels of training, could in turn be re- 
lated to the ability of the discipline to 
attract and hold the interests of its 
talented undergraduates. 

THE MOTIVATING EFFECT OF LEARNING
BY DIRECTED DISCOVERY¹


BERT Y. KERSH 
Teaching Research, Oregon State System of Higher Education, Monmouth, Oregon²


High school students were taught 2 novel rules of addition by a programed
booklet procedure. Subsequently, ⅓ of the 90 Ss were given
individual guidance in discovering the explanation for the rules
(“guided discovery”), ⅓ were taught the explanation by a programed
booklet (“directed learning”), and the remaining ⅓ were
given no further instruction (“rote learning”). A questionnaire and
a test of recall and transfer given 3 days, 2 weeks, and 6 weeks
later favored the Rote Learning and Guided Discovery groups. The
questionnaire indicated that the Guided Discovery group practiced
the rules during the time interval between the learning and test
period more than Ss in other groups (chi square, p = .05). The data
support the hypothesis that self-discovery motivates the S to practice
more and thus to remember and transfer more than he might if taught
directly.


Advocates of the process of learning 
by directed discovery claim a number of 
advantages, most of which are included 
in a recent article by Bruner (1961). He 
has suggested that learning by discovery 
benefits the learner in four ways: it (a) 
increases the learner’s ability to learn 
related material, (b) fosters an interest 
in the activity itself rather than in the 
rewards which may follow from the 
learning, (c) develops ability to ap- 
proach problems in a way that will more 
likely lead to a solution, and (d) tends
to make the material that is learned
easier to retrieve or reconstruct.

¹ This study was supported by a research
grant from the Graduate School of the University
of Oregon, Eugene, Oregon. Grateful
acknowledgment is due Jerome C. R. Li,
Chairman of the Statistics Department, Oregon
State University, for his invaluable assistance
in the statistical analysis. Additional
research assistance was ably provided by
William R. Hogan and Arthur M. Jackson.
The research was reported at the APA meeting
in Chicago, September 1960, under a
different title.

² Teaching Research is an agency of the
Oregon State Board of Higher Education,
located on the campus of Oregon College of
Education, Monmouth.

Research evidence does not entirely 
support Bruner’s arguments. One of the 
more recent reviewers, Ausubel (1961, p. 
47), concludes “that most of the reason- 
ably well-controlled studies report nega- 
tive findings.” However, as is true in 
other areas of research, the evidence is 
somewhat equivocal, partly because it is 
difficult to equate studies in terms of the 
amount and kind of direction that is 
provided. The experimental subjects 
rarely if ever are required to learn com- 
pletely without help, and the kind of 
help provided commonly differs. Consequently,
there are studies which appear
to be somewhat contradictory, such
as Craig’s (1956), in which the “di- 
rected” group learned and retained sig- 
nificantly more principles than the “in- 
dependent” group, and Kittell’s (1957), 
in which the group which received an 
intermediate amount of guidance was 
superior in learning, retention, and 
transfer to groups receiving either more 
or less direction. It has been suggested 
that the “intermediate” amount of guid-
ance provided by Kittell may have actu- 
ally exceeded the amount Craig pro- 
vided to his directed group (Ausubel, 
1961, p. 52). 

One of the few studies that forced 
learners to discover almost entirely with- 
out help provides data in support of the 
discovery process (Kersh, 1958). The 
contrasting directed treatment groups 
were superior in learning rate and im- 
mediate recall, but the “no help” group 
was superior in terms of retention and 
transfer after a period of approximately 
one month following the learning 
period. No evidence was produced to 
indicate that the no help group under- 
stood the rules better. Instead, an expla- 
nation was offered in terms of practice. 
On the basis of a subjective analysis 
of the subject’s comments written on 
the retests and reported to the experi- 
menter, it was concluded that the learn- 
ers were motivated to continue the 
learning process or to continue prac- 
ticing the task after the formal learning 
period.

The present experiment was designed 
to provide formal data concerning the 
motivating power in question. 


HYPOTHESIS


Each subject had the task of learning 
the following two rules of addition: 

1. Odd Numbers rule. The sum of 
any series of consecutive odd numbers 
beginning with 1 is equal to the square 
of the number of figures in the series. 
(For example, 1, 3, 5, 7, is such a series;
there are four numbers, so 4 × 4 is 16,
the sum.)

2. Constant Difference rule. The sum 
of any series of numbers in which the 
difference between the numbers is con- 
stant is equal to one-half the product of 
the number of figures and the sum of 
the first and last numbers. (For example,
2, 3, 4, 5, is such a series; 2 and 5
are 7; there are four figures, so 4 × 7
is 28; half of 28 is 14, which is the sum.)
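
The two rules can be checked numerically against direct addition. The Python sketch below is illustrative only: the function names are hypothetical and are not part of the experimental materials.

```python
def odd_numbers_rule(count):
    """Sum of the first `count` consecutive odd numbers (Odd Numbers rule)."""
    return count * count


def constant_difference_rule(series):
    """Sum of a constant-difference series: half of count x (first + last)."""
    return len(series) * (series[0] + series[-1]) / 2


# The two worked examples given in the text.
odds = [1, 3, 5, 7]
assert odd_numbers_rule(len(odds)) == sum(odds) == 16

series = [2, 3, 4, 5]
assert constant_difference_rule(series) == sum(series) == 14

# Brute-force comparison of each rule with direct addition.
for n in range(1, 50):
    assert odd_numbers_rule(n) == sum(range(1, 2 * n, 2))
    for first in range(-5, 6):
        for step in range(-3, 4):
            s = [first + step * k for k in range(n)]
            assert constant_difference_rule(s) == sum(s)
```

The Odd Numbers rule is a special case of the Constant Difference rule, since a run of n consecutive odd numbers starting at 1 ends at 2n - 1 and therefore sums to n(1 + 2n - 1)/2 = n².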

The rules can be learned by simple 
memorization of the task procedure as 
above. Further, the learner can become 
cognizant of certain relations which 
these rules bear to geometrical and 
arithmetical concepts, in which case it 
is assumed that his learning will be more 
meaningful. The definition of meaning, 
as well as the geometrical and arithmetical
relationships referred to, are identified
in a previous publication (Kersh,
1958). In the hypothesis statement be- 
low, the term “relationships” refers spe- 
cifically to those in the reference cited 
above and generally to comparable rela- 
tionships in related tasks. 

As will be explained below, the ex- 
perimental treatments in the present 
study differed primarily with respect to 
the extent of the external direction pro- 
vided the subjects in learning the rela- 
tionships referred to above. The present 
experiment was designed to test the 
following hypothesis. 

To the extent that the external direc- 
tion provided to the learner is lessened 
during his attempts to discover the rela- 
tionships which are considered essential 
to the understanding of a cognitive task: 
(a) the learner will tend to use the 
learned material more frequently after 
the learning period (i.e., to extend the 
practice period voluntarily) and, as a 
result, (b) he will remember it longer 
and transfer his learning more effec- 
tively. 

It should be noted that the hypothesis 
is written in two parts and that the 
second is dependent upon the first. 


PROCEDURE 


A total of 90 high school geometry students 
was utilized, having been selected from a 
larger group on the basis of a pretest cover- 
ing the arithmetical and geometrical concepts 
and procedures that were considered essen-
tial prerequisites to the tasks used in the 
experiment. The entire sample was then 
taught the two rules of addition given above 
by being simply told the rules and given 
practice in their application. They were 
taught by a programed booklet procedure to 
the same criterion, six successive applications 
of each of the two rules. Thereafter, the
subjects were divided at random into three 
main groups of 30 each, and each group was 
treated differently. 

One group, called the Directed Learning 
group, was taught the rules and their expla- 
nation entirely by a programed learning tech- 
nique. Each subject learned from a booklet 
in which the learning process was broken 
down into smaller steps, and answers to ques- 
tions or solutions to problems were revealed 
to the subject whether he responded correctly 
or not. 

A second group, called the Guided Discov- 
ery group, was required to discover the 
explanation with guidance from the experi- 
menter. The subjects in the Guided Dis- 
covery group were taught tutorially using a 
form of Socratic questioning which required 
each subject to perform specific algebraic 
manipulations and to make inferences with- 
out help. The guidance was a practical ex- 
pedient, since it was necessary to control 
between groups the quality and quantity of 
the relationships used in explaining the rules. 

The final group was called, appropriately, 
the Rote Learning group since the explana- 
tion for the rules was omitted. This treat- 
ment was incorporated in the research design 
primarily as the control for “meaningful” 
learning. 

After the learning period of the experi- 
ment, a test of recall and transfer was given 
to subgroups of each treatment group after 
3 days, 2 weeks, and 6 weeks. For this pur- 
pose each of the three main groups was 
divided into three subgroups of 10 each. 

The test consisted of two problems and a 
short questionnaire. The problems were given 
first with instructions to show all work 
including scratchwork. The two test prob- 
lems were as follows: 

1. John’s employer agrees to pay him 
$1.00 for his first day of work and increase 
his pay by $2.00 each day. How much will 
he receive for the first month’s work if he 
works all 30 days? 

2. A man is left a sum of money by an 
eccentric relative. The will states that he 
will receive $10.00 the first month and 
that each successive monthly payment will 


be increased by $5.00 (i.e., he will receive 
$10.00 the first month, $15.00 the second 
month, $20.00 the third month, etc.). His 
monthly payment at the end of four years 
is $245.00. What is the total amount he 
has been paid by that time? 


The questionnaire asked the subject to 
state each rule, using examples if necessary, 
and to report whether or not he made use 
of the rules after the formal learning period. 


RESULTS 


The number of subjects in each group 
who used the appropriate rule in an 
acceptable way on the test was em- 
ployed as the index of transfer. Accept- 
able use of a rule for the first test 
problem meant the use of either rule to 
obtain the solution; for the second test 
problem, only the Constant Difference 
rule was acceptable. Computational ac- 
curacy was not required. 
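
As a worked illustration of why either rule serves for the first test problem while only the Constant Difference rule serves for the second, the two problems can be solved as follows. This is a sketch for the reader; the helper name is hypothetical, and the subjects of course worked by hand.

```python
def constant_difference_sum(first, last, count):
    """Constant Difference rule: half of count x (first + last)."""
    return count * (first + last) / 2


# Problem 1: $1.00 on the first day, rising by $2.00 each day, for 30 days.
# The daily amounts are the consecutive odd numbers 1, 3, 5, ..., 59.
days = 30
last_day_pay = 1 + 2 * (days - 1)                                   # 59
by_odd_numbers_rule = days ** 2                                     # 900
by_constant_difference = constant_difference_sum(1, last_day_pay, days)  # 900.0
assert by_odd_numbers_rule == by_constant_difference == 900

# Problem 2: $10.00 the first month, rising by $5.00 each month, ending at $245.00.
# The series does not consist of odd numbers starting at 1, so only the
# Constant Difference rule applies.
months = (245 - 10) // 5 + 1                                        # 48, i.e., four years
total_paid = constant_difference_sum(10, 245, months)               # 6120.0
assert total_paid == sum(range(10, 246, 5)) == 6120
```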

The number of subjects in each group 
who wrote an acceptable statement of 
each rule was used as a measure of pure 
retention. To be acceptable, each sub- 
ject’s statement had to be complete and 
accurate, but not necessarily in the same 
words as the original statements. Errors
in spelling or grammar were overlooked. 

Table 1 presents the number who 
used and stated the rules in the accept- 
able way on the test problems. A total 
of 90 subjects served as the basis for the 
data in Table 1, 10 subjects per cell. 

In the statistical analysis, use was 
made of a chi square technique devised 
by Li (1957, pp. 416-420). The data included
under each of the columns of
Table 1 were envisioned as a 2 × 9
contingency table, with 8 df. Four separate
analyses were then conducted,
each of which broke down the chi 
square into the following components: 
(a) differences between teaching treat- 
ments (2 df), (b) differences between 
test periods (2 df), and (c) differences 
attributed to interaction of treatments 
and time periods (4 df). 
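
The breakdown of 8 df into 2 + 2 + 4 can be sketched in Python. This is not offered as Li's procedure; it assumes only the standard result that, with equal cell sizes, the Pearson chi square for a 2 × k table of proportions splits additively when every component uses the same pooled proportion. The function name and the counts are hypothetical and are NOT the data of this study.

```python
def partition_chi_square(successes, n_per_cell):
    """Split a 2 x (T*P) chi square into treatment, period, and interaction parts.

    successes[t][p] is the number of subjects (out of n_per_cell) in
    treatment t at test period p who met the criterion.
    Returns {component: (chi_square, df)}.
    Assumes the overall proportion is strictly between 0 and 1.
    """
    n_treat = len(successes)
    n_period = len(successes[0])
    total_n = n_treat * n_period * n_per_cell
    p_bar = sum(map(sum, successes)) / total_n
    pooled = p_bar * (1 - p_bar)  # common binomial variance term

    def component(group_totals, group_n):
        # Weighted squared deviations of group proportions from the overall one.
        return sum(group_n * (s / group_n - p_bar) ** 2 for s in group_totals) / pooled

    cells = [s for row in successes for s in row]
    chi_total = component(cells, n_per_cell)
    chi_treat = component([sum(row) for row in successes], n_period * n_per_cell)
    chi_period = component([sum(col) for col in zip(*successes)], n_treat * n_per_cell)
    chi_inter = chi_total - chi_treat - chi_period  # remainder, by subtraction
    return {
        "total": (chi_total, n_treat * n_period - 1),
        "treatments": (chi_treat, n_treat - 1),
        "periods": (chi_period, n_period - 1),
        "interaction": (chi_inter, (n_treat - 1) * (n_period - 1)),
    }


# Hypothetical counts only: rows are treatments, columns are test periods.
example = [[8, 7, 5], [6, 5, 4], [4, 3, 1]]
result = partition_chi_square(example, n_per_cell=10)
for name, (chi2, df) in result.items():
    print(f"{name:12s} chi square = {chi2:6.3f}  df = {df}")
```

With three treatments and three test periods the degrees of freedom come out as 2, 2, and 4, summing to the 8 df of the full 2 × 9 table.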
TABLE 1

NUMBER OF SUBJECTS (OF 10 IN CELL) WHO USED AND STATED THE
RULES CORRECTLY ON THE RETEST

                          Used Rules                 Stated Rules
                      Odd         Constant        Odd         Constant
Treatment Groups      Numbers*    Difference      Numbers*    Difference*
                      1           2               3           4

Rote Learning:
  3 days              7           [illeg.]        [illeg.]    [illeg.]
  2 weeks             7           [illeg.]        [illeg.]    [illeg.]
  6 weeks             [illeg.]    [illeg.]        [illeg.]    [illeg.]
Guided Discovery:
  3 days              6           [illeg.]        [illeg.]    [illeg.]
  2 weeks             [illeg.]    [illeg.]        [illeg.]    [illeg.]
  6 weeks             [illeg.]    [illeg.]        [illeg.]    [illeg.]
Directed Learning:
  3 days              [illeg.]    3               3           4
  2 weeks             [illeg.]    3               1           3
  6 weeks             0           3               1           1

* Differences between treatment groups and between test periods significant by chi square at or beyond
the .05 level.

None of the interaction effects was
significant, indicating that the rate of
forgetting did not differ significantly
across the teaching treatment groups.
A trend analysis of the test data indicated
also that the rate of forgetting was
constant for all groups (Li, 1957, pp.
226-233).

Otherwise, as pointed out by the footnote
references in Table 1, the differences
between treatment groups and between
test periods were found to be
significant for all columns except that
headed “Constant Difference 2,” for
which the observed differences were
found not to be reliable.

Perhaps the most striking finding in
the present study is that the Rote Learning
group was found to be consistently
superior in every respect to the other
treatment groups. Although this completely
unanticipated finding has no direct
bearing on the hypothesis in question,
it does nevertheless bear clearly
upon the related question of “meaning- 
ful vs. mechanical” learning. This find- 
ing will be discussed in a subsequent 
section. 

Strictly speaking, the hypothesis 
which the present experiment was de- 
signed to test involves only the Guided 
Discovery and Directed Learning treat- 
ments. To support the major hypothe- 
sis, the data should have shown that the 
subjects comprising the Guided Discov- 
ery group used the rules after the learn- 
ing period more frequently than the 
subjects in the Directed Learning group; 
and, if so, that the former remembered 
and transferred the rules more effec- 
tively than the latter. 

With respect to the frequency of us- 
ing the rules after the learning period, 
the results do support the hypothesis. 
Although the number of subjects in
each group who reported that they did 
use the rules was very small, the differ- 
ence between the frequency patterns of 
the two groups in question is statistically 
significant. Eleven subjects of the 30 in 
the Guided Discovery group reported 
that they had used the rules as com- 
pared with two subjects in the Directed 
Learning group. In the Rote Learning 
group, six subjects of 30 reported in the 
affirmative. 
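
The size of this difference can be gauged with an ordinary 2 × 2 chi square on the pooled counts (11 of 30 versus 2 of 30). The sketch below is a reader's check only: the text reports a test on the frequency patterns of the two groups, which is not necessarily this computation, and the function name is hypothetical.

```python
def chi_square_2x2(yes_a, n_a, yes_b, n_b):
    """Pearson chi square (1 df, no continuity correction) for two proportions."""
    observed = [[yes_a, n_a - yes_a], [yes_b, n_b - yes_b]]
    total = n_a + n_b
    row_totals = [n_a, n_b]
    col_totals = [yes_a + yes_b, total - yes_a - yes_b]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


CRITICAL_05_1DF = 3.841  # tabled chi-square value for p = .05 with 1 df

# Guided Discovery (11 of 30) versus Directed Learning (2 of 30).
chi2 = chi_square_2x2(11, 30, 2, 30)
print(round(chi2, 2), chi2 > CRITICAL_05_1DF)  # 7.95 True
```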

With respect to the relative perma- 
nence of the retention and increased 
transfer effects, the results also support 
the hypothesis. The Guided Discovery 
group is clearly superior to the Directed 
Learning group 3 days after the learn- 
ing period, and since the rate of for- 
getting may be presumed to be approxi- 
mately the same for each treatment 
group (see statistical analysis above), 
their initial superiority remains after 6 
weeks.


DISCUSSION


The data from this present experi- 
ment do not support the generalization 
that learning by a process which in- 
volves discovery is necessarily superior 
to learning by more highly directed 
processes. Indeed, these data suggest 
that under certain conditions of learn- 
ing, highly formalized “lecture-drill” 
techniques, ordinarily considered sterile 
and meaningless, produce better results 
than techniques which attempt to devel- 
op “understanding.” 

One explanation for the present re- 
sults is that they reflect a simple and 
well known phenomenon, retroactive 
inhibition. The experimental efforts to 
inject meaning into the rules amounted 
to following their initial rote learning 
with a closely related and complex 
learning task; thus the Rote Learning 
group may have surpassed other groups 
simply because retention among the 


latter was inhibited by the interpolated 
learning. 

How may the present results be rec- 
onciled with those of the previous ex- 
periment by Kersh (1958), in which 
learning by discovery proved markedly 
superior? The preferred interpretation 
is that the findings of the two studies are 
actually complementary. Schematically, 
the treatments employed in the two ex- 
periments may be compared on a line 
representing the continuum of learning 
processes: at one extreme, learning 
without any external direction whatsoever
(true self-discovery); at the other,
learning by lecture-drill processes (rote 
learning), as follows: 


1958              No          Direct                          Rule
EXPERIMENT        Help        Reference                       Given
                   |             |              |               |
PRESENT                       Guided         Directed         Rote
EXPERIMENT                    Discovery      Learning         Learning

As is indicated above, the Direct Ref- 
erence treatment in the 1958 experi- 
ment is comparable to the Guided Dis- 
covery treatment in the present one; 
similarly, the Rule Given and Rote 
Learning groups correspond. The present
experiment has no counterpart to
the No Help treatment of the previous 
study; and, in the previous one, the 
Directed Learning treatment was not 
represented. 

When compared as above, the results 
of the two experiments are remarkably 
similar. The initial achievement of the 
comparable groups in both experiments 
was very high, then dropped to where
only about half of each group was able 
to recall and apply the rules after 4 to 
6 weeks. In each experiment the dif- 
ference in the performance of the Rote 
Learning and Directed Discovery groups 
was not notable; if anything, the Rote 
Learning groups tended to perform 
slightly better. 

With respect to the motivating power 
of learning by discovery, in the 1958
experiment the superior performance 
of the No Help subjects on the retest 
together with their written comments 
and verbal reports to the experimenter 
strongly evidenced their increased inter- 
est. The present results leave no doubt 
that there is a tendency for interest to 
accrue as a result of learning by dis- 
covery. 

The results of both experiments also 
are consistent in their failure to support 
the notion that attempts to provide 
added meaning will necessarily prolong 
memory for rules and procedures and 
will enhance their transfer. On the con- 
trary, both experiments suggest that 
such attempts may well do more to 
interfere with learning than enhance it. 
This does not mean that rote learning 
is superior to learning with understand- 
ing. Rather it means that we need to 
know much more than we do about 
meaningful learning and how we come 
by it. 

The relatively poor showing of the 
Directed Learning group in the present 
study is partially explained by the sub- 
jects’ reported failure to practice the 
rules after the learning period to the ex- 
tent that the subjects did in other groups. 
Why the Rote Learning treatment gen- 
erated more interest than the treatment 
in question again may reflect nothing 
more than that the original learning was 
inhibited by the interpolated programed 
learning. The subjects’ unfamiliarity 
with the instructional procedure may 
have contributed to their confusion. 

Most certainly the data from the two 
experiments under discussion suggest 
that the frequently taught principles of 
learning that pertain to self-discovery 
and meaning (see introduction) should 
be restated or qualified. The following 
statements are offered for further study. 

Learning by self-discovery. Learning
by self-discovery is superior to learning 
with external direction only insofar as it
increases student motivation to pursue 
the learning task. If sufficiently moti- 
vated, the student may then continue 
the learning process autonomously be- 
yond the formal period of learning. As 
a result of his added experience, the 
learner may then raise his level of 
achievement, remember what he learned 
longer, and transfer it more effectively. 
The explanation for the elusive drive 
generated by independent discovery is 
not evident, but several have been of- 
fered, including the Zeigarnik effect of 
superior memory for unfinished tasks 
and the Ovsiankina effect of resumption 
of incomplete tasks.³ It also could be
explained in terms of operant condition- 
ing; specifically, as a kind of “searching 
behavior” reinforced by the experimen- 
ter’s comments and by the subject’s own 
successful progress toward a solution. 
Whatever the explanation, the motivat- 
ing power evidently does not appear in 
strength unless the student is required 
to learn almost completely without help 
and expends intensive effort over a 
period of 15 minutes or more. 
Meaningful learning. Aside from the 
advantage the student may come to 
have academically, he may not benefit 
from knowing the explanations for rules 
and procedures he learns, i.e., the pat- 
tern of relationships involved. That 
which is meaningful (understood) may 
or may not be retained longer and trans- 
ferred more effectively than that which 
has been learned by rote. Moreover, 
superficial efforts to gain understanding 
after a rule or principle has been memo- 
rized may have an inhibitory effect 
when the student attempts to recall and 
transfer the original learning. If it is 
important only that the task be understood
(as is most often the case, presumably),
the essential relationships may be
learned most economically when taught
by another person or teaching program,
not by process of self-discovery.

³ The author is particularly indebted to
Julius M. Sassenrath and the late Percival
M. Symonds for their critical comments and
suggestions.

ABILITIES AND TEACHER RATINGS OF
HIGH SCHOOL STUDENTS

The stability of factors of divergent thinking (DT) in Ss of school
age and the relationship between factor scores and performances of 
the Ss remain to be defined. 78 boys and 113 girls with IQs of 115 
and higher were administered DT tests as 10th and 11th graders. 
3 factors were found common to both sexes and relatively stable over 
a 12-month interval. 15 of 28 r’s between factor scores and teacher 
ratings of the Ss’ DT performances as 11th graders in English, mathe- 
matics, science, and social studies were low, positive, and significant 
at or beyond the .10 level. The nature of the DT factors and the 
expression thereof in classroom activities varied markedly according to
sex and subject field.


Many factors of divergent thinking 
identified in young adults by Guilford, 
Kettner, and Christensen (1956) re- 
main to be identified in younger males 
and females of school age. Much re- 
search must be done to determine the 
developmental levels at which specific 
factors in divergent thinking become 
differentiated out of general or group 
factors, assumed to be present in chil- 
dren of preschool age. Also, factor 
scores, derived from responses to test 
items, are yet to be described in terms 
of psychologically meaningful performances
of school-aged children. In this
paper, the relationships between factors 
of divergent thinking and teacher rat- 
ings of the performances of academical- 
ly talented high school boys and girls in 
various subject fields are reported and
discussed.


PROCEDURE 
Subjects 


The students were 78 boys and 113 girls 
with IQs above 115, enrolled in two large 
high schools for two consecutive years as 
tenth and eleventh graders. All the students 
were used in identifying the factors, but the 


number of each sex rated by the teachers of 
English, social studies, science, and mathe- 
matics ranged from 29 to 38. 


Method 


As tenth graders, the students were admin- 
istered 18 tests. Fourteen of these were 
identical to or based upon tests devised by 
Guilford et al. (1956); four were devised by
Bereiter (1959). Based upon a factor analy- 
sis of the tenth grade results, only eight of 
the tests were found to load significantly on 
factors common to both boys and girls. These 
eight tests were administered 12 months later 
to the students who remained in the two 
schools as eleventh graders. Three factors 
common to both sexes for both years were 
extracted, using Tucker’s interbattery method 
of factor analysis and other procedures devel- 
oped by Ethnathios (1960) and Harris. 

A brief description of the eight tests, in- 
cluding the related factors, is now given. 

1. Four-Word Combinations, Form A, an 
experimental test devised by Guilford et al. 
(1956), adapted by Bereiter (1959), and 
significantly loaded on the factor, expression- 
al fluency. The task was to produce four- 
word sentences, the first letter of each word 
being given; at least one word had to be 
varied in successive responses in order to be 
counted as a different response. 

2. Word Arrangement, devised by Guil- 
ford et al. (1956), adapted by Bereiter 
(1959), and significantly loaded on the fac-
tor, expressional fluency. The task was to 
produce sentences containing words which 
were supplied; in each sentence at least two 
words had to be interchanged to be counted 
as a different response. 

3. Object Naming, based on Guilford et 
al. (1956), adapted by Bereiter (1959), and 
significantly loaded on the factor, expression- 
al fluency. The task was to produce names 
for things; the names had to differ and could 
not be merely the same name preceded by 
different adjectives or articles. 

4. Plot Titles, devised by Guilford et al. 
(1956), adapted by Bereiter (1959), and 
significantly loaded on the factor, ideational 
fluency. The task was to produce titles for 
story plots which were supplied; each title 
had to have at least one main word different. 
The responses were also scored for originality 
by assigning a score of 1 to each response
judged to be clever. 

5. Brick Uses, devised by Guilford et al. 
(1956), adapted by Bereiter (1959), and 
significantly loaded on the factor, ideational 
fluency. The task was to give uses of a brick; 
each nonidentical use was counted as 1
response. 

6. Structural Functions, devised by Be- 
reiter (1959), and significantly loaded on the 
factor, ideational fluency. The task was to
produce solutions to problems involving the 
structural properties of objects; each nonidentical
response was counted as 1 response.

7. Product Design, devised by Bereiter 
(1959), and loaded on two factors—idea- 
tional fluency and figural ideational fluency. 
The task was to sketch in designs for car 
grills and lamp shades, the outlines of grills 
and shades being supplied; each nonidentical 
design was counted as 1 response.

8. Alphabet Design, devised by Bereiter 
(1959), and loaded on the factor, figural 
ideational fluency. The task was to design
new letters for a new alphabet; each new
letter was scored 1 except in those instances
where the student (a) turned the paper
upside down, wrote the regular alphabet, and
presented this as 26 new letters; or (b) presented
a series of shorthand symbols. The
latter in toto were counted as 1 response.

A composite score for each student on each 
of the three factors—expressional fluency, 
ideational fluency, and figural ideational 
fluency—was derived by changing the raw 
test scores to standard scores and applying 
the appropriate factor weight for each test 
score. 
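
The composite scoring just described (standard scores, then weighting) can be sketched as follows. The test names, raw scores, and factor weights are hypothetical, and the use of the population standard deviation is an assumption; the text does not report the actual weights or the standardization formula.

```python
from statistics import mean, pstdev


def standardize(raw_scores):
    """Convert raw scores to standard (z) scores.

    Assumes the population standard deviation; the source does not say which was used.
    """
    m = mean(raw_scores)
    sd = pstdev(raw_scores)
    return [(x - m) / sd for x in raw_scores]


def composite_scores(raw_by_test, weights):
    """Weighted sum of standard scores, one composite per student."""
    z_by_test = {test: standardize(scores) for test, scores in raw_by_test.items()}
    n_students = len(next(iter(raw_by_test.values())))
    return [
        sum(weights[test] * z_by_test[test][i] for test in weights)
        for i in range(n_students)
    ]


# Hypothetical raw scores for four students on three expressional fluency tests.
raw = {
    "four_word_combinations": [12, 15, 9, 20],
    "word_arrangement": [7, 10, 6, 11],
    "object_naming": [25, 30, 22, 35],
}
# Hypothetical factor weights, for illustration only.
factor_weights = {
    "four_word_combinations": 0.5,
    "word_arrangement": 0.3,
    "object_naming": 0.2,
}
expressional_fluency = composite_scores(raw, factor_weights)
```

Standardizing first keeps a test with a wide raw-score range, such as the hypothetical object naming scores here, from dominating the composite.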

As eleventh graders, the students were en- 


rolled in special classes for academically tal- 
ented students in English, social studies, 
science and mathematics; but not all students 
were enrolled in all four special classes. The 
teachers of the special classes met with mem- 
bers of the research team to arrive at defini- 
tions of creativity, fluency, and originality. 
Subsequently, they observed their students 
and rated each student on fluency and 
originality, using a 5-1 scale with 5 being 
highest and 1 lowest. The teacher ratings 
of fluency in each subject were correlated 
with the factor scores of expressional fluency, 
ideational fluency, and figural ideational 
fluency; the teacher ratings of originality 
were correlated with the test score of origi- 
nality. 
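
For readers who wish to retrace such correlations, a minimal Python sketch of a product-moment r and its associated t statistic follows. It assumes the reported r’s are Pearson coefficients tested against a t distribution with N - 2 df, which the text does not state explicitly; the function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r * r))


# Example with a reported value: r = .41 for N = 29 students.
t_value = t_for_r(0.41, 29)  # about 2.34, to be compared with a tabled t for 27 df
```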

The definition of creativity generally ac- 
cepted was that proposed by Drevdahl 
(1956):

Creativity is the capacity of persons to 
produce compositions, products, or ideas 
of any sort which are essentially new or 
novel, and previously unknown to the pro- 
ducer. It can be imaginative activity, or 
thought synthesis, where the product is 
not a mere summation. It may involve the
forming of new patterns and combinations 
of information derived from past experi- 
ence, and the transplanting of old relation- 
ships to new situations and may involve 
the generation of new correlates. It must 
be purposeful or goal directed, not mere 
idle fantasy—although, it need not have 
immediate practical application or be a 
perfect and complete product. It may take 
the form of an artistic, literary or scientific 
production or may be of a procedural or 
methodological nature (p. 22).


Guilford’s (1959) definitions of fluency and 
originality were accepted; the criterion of 
originality was cleverness as judged by the 
teacher when rating the student and as 
judged by the writers when scoring the test, 
Plot Titles. 


RESULTS 


Table 1 presents the r’s for boys and 
girls in each subject. No r’s are reported
for girls in mathematics because too few 
girls were enrolled in the special elev- 
enth grade mathematics classes. 

Fifteen of the 28 r’s are positive and
significant at or beyond the .10 level:
5 of 7 between ratings of fluency and
scores of figural ideational fluency, 4 of
7 between ratings of fluency and scores
of expressional fluency, 3 of 7 between
ratings of fluency and scores of ideational
fluency, and 3 of 7 between ratings
of originality and scores of originality.


TABLE 1


CORRELATIONS BETWEEN SCORES OF DIVERGENT THINKING AND TEACHER RATING


Factor                          English        Social studies
                                N     r

Expressional fluency
  Male                          29    .41**
  Female                        38    .11
Figural ideational fluency
  Male                          29    .34*
  Female                        38    .17
Ideational fluency
  Male                          29    .41**
  Female                        38    .29*
Originality
  Male                          29    .36*
  Female

29  −.15    29
30          30  −.11
29  .05     29  .40**
30  .31*    30  .35**   -   -
29  .08     29  .02     29  .20
30  .14     30  -
29  .46**   29  .24     29  .24


* Significant at .10 level.
** Significant at .05 level.
*** Significant at .01 level.

The significant r’s, according to the 
four subject fields, are 5 of 8 in English, 
5 of 8 in science, 3 of 8 in social studies, 
and 2 of 4 in mathematics. The signifi- 
cant r’s, according to sex and excluding 
mathematics, are 7 of 12 for boys and 
6 of 12 for girls. On only 3 of the 12 
sets for boys and girls are both r’s either 
significant or nonsignificant—figural 
ideational fluency and science (.70 boys, 
.35 girls), ideational fluency and English
(.41 boys and .29 girls), and ideational 
fluency and social studies (.08 boys and 
.14 girls). In the other 9 sets, one r is 
significant for girls but not for boys, or 
vice versa. 
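
As an illustration of how the significance of a single correlation can be checked, the sketch below converts r to a t statistic with n − 2 degrees of freedom, using the English ideational fluency value for boys (r = .41, N = 29). The function name is hypothetical, and the use of a two-tailed test is an assumption, since the report does not say how significance was assessed.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero, with df = n - 2."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# r = .41 for boys in English (N = 29), as reported in Table 1.
t = t_from_r(0.41, 29)
print(round(t, 3))

# Standard two-tailed critical values of t for df = 27:
# 1.703 at the .10 level and 2.052 at the .05 level.
print(t > 2.052)
```

Under the two-tailed assumption, the resulting t exceeds the .05 critical value, in agreement with the double asterisk attached to this coefficient.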


DISCUSSION


All eight tests were loaded on the 
three common factors for boys as eleventh
graders; only four were for girls
and four other specific factors emerged
for girls (Ethnathios, 1960). The r’s
between teacher ratings and scores for 
boys and girls varied also, as noted pre- 
viously. Apparently, divergent thinking 
abilities differ substantially for the two 
sexes; and the expression of these abili- 
ties in the various subject fields also dif- 
fers markedly. 

All of the r’s between the English 
teachers’ ratings of fluency and the fac- 
tor scores were positive, and four of six 
were statistically significant. Four of six 
r’s were also significant in science. In
social studies, one was negative and only
two of six were significant. The eleventh
grade English classes were devoted
to literature and composition; the sci- 
ence classes were chemistry; the social 
studies classes were American history. 
It is possible that the students in Ameri- 
can history had little opportunity to 
demonstrate fluency and the teachers 
did not rate the students reliably. It is 
also possible that divergent thinking
abilities are expressed in different forms 
in the various subject fields, that the 
abilities measured by the tests are ex- 
pressed more directly in English and in 
science than in history. 

Although devising reliable tests and 
submitting the scores obtained to factor 
analysis is essential to the identification 
and sorting of human abilities, the test 
scores are useless for decision making in 
schools if well-prepared teachers cannot 
recognize differences among student 
performances in the traits which the 
tests purport to measure. The writers 
do not consider the teachers’ ratings in 
this study to be an appropriate criterion 
of concurrent validity. However, they 
are skeptical of research which indicates 
wide variability among students in cre- 
ativity, when the variability is inferred 
solely from information gathered in testing
situations of short duration. If researchers
are unable to differentiate
creativity in terms of the performances
of students in nontest situations, it is 
possible that the differences are merely 
artifacts of the specific tests being used. 


THE EFFECT OF CLASS ATTENDANCE AND “TIME
STRUCTURED” CONTENT ON ACHIEVEMENT
IN GENERAL PSYCHOLOGY¹


PAUL W. CARO, Jr.²
University of Tennessee 


The effects of attending class and of structuring an introductory 
psychology course with respect to time and content were studied, 
using 335 undergraduates in a 2 X 3 factorial design. Time structur- 
ing was accomplished through schedules of testing. Using end-of- 
course achievement test scores as the criterion, F ratios for neither 
the class attendance variable, the time structure variable, nor the 
treatment interaction were significant at the .10 level. It was con- 
cluded that students performed as well through independent study as 
in the conventional class situation, and time structured content was 
ineffective as a determiner of student achievement. Student dropouts 
and the number of students seeking individual assistance were unrelated
to the experimental treatments.


During the past half century extensive 
research devoted to teaching methodology
has generally failed to demonstrate
a relation between methods investigated 
and end-of-course achievement test per- 
formance. Student achievement also has 
been shown to be independent of the 
frequency of class attendance, although 
the assumption underlying virtually all 
this research is that something critical to 
learning is transmitted from teacher to 
student which the latter cannot gain by 
himself. Studies of this variable have
eliminated classes for a period during 
the term (e.g., Milton, 1959), resched- 
uled a course to meet less frequently 
than normal (e.g., Fields, 1958), or 
established some form of tutoring (e.g., 
Guetzkow, Kelly, & McKeachie, 1954). 
Studies which substituted correspondence
study, television, or film instruction
for more traditional classroom relationships
(e.g., Parsons, 1957) have revealed
that the achievement of experimental
and control groups was not significantly
different. Even when all student-teacher
and student-student contact was
eliminated (Parsons, Ketcham, & Beach,
1958) there was no consistent superiority
on posttest achievement by more
traditionally instructed groups.


¹ This paper is based upon a PhD dissertation
submitted to the Graduate Council of
the University of Tennessee, 1961.

² Now with the Training Section, Industrial
Relations Department, Mead Corporation,
Chillicothe, Ohio.

It appears the means by which in- 
structional material is presented is un- 
important. The critical aspect may be 
the existence of a formal relationship 
itself. 

Another variable which is of possible 
influence in teaching efficiency is the 
extent to which the course content is 
structured in terms of time. Tests fre- 
quently are used as a convenient means 
of breaking the course content into units, 
each of which is tested at a different 
time. Improved student performance 
has been found to result from various 
uses of tests, but when tests are em- 
ployed without any indication that they 
were intended for instructional pur- 
poses, their usefulness has not been 
established. 

Ross and Henry (1939) found conflicting
results between educational and
general psychology courses when fre- 
quency of testing was the independent 
variable. Noll (1935) found no signifi- 
cant differences between midterm and 
final examination grades of students 
who received no other tests and those 
who received, in addition, four shorter 
tests at 4-week intervals. Keys (1934) 
administered the same tests in the form 
of short weekly tests to one and monthly 
tests to the other of his groups and 
found no differences in final examina- 
tion performance. Fitch, Drucker, and 
Norton (1951) found a statistically sig- 
nificant difference between weekly vs. 
monthly tested groups when other fac- 
tors were partialed out, but the weekly 
tests were used to guide their students in 
some manner.

Whether the time structure provided
by periodic testing per se enhances student
achievement has not been established.
An advantage for frequent testing
is suggested in some instances.

The present study compared achievement
of students who attended regular
lecture-discussion classes with those who
did not attend such class sessions
throughout the term, and investigated
student achievement under different
amounts of time structure, as determined
by the schedule of testing.


METHOD 


The research discussed in this report was 
accomplished during the second half of a
two-quarter course in introductory psychology.
Two sections of the course were scheduled, 
meeting at 10:00 and 11:00 a.m. on Mon- 
days, Wednesdays, and Fridays. Hilgard’s 
introductory text (1957), supplemented by a 
workbook and three books of collateral read- 
ings, was used. 

The 335 undergraduate subjects had no 
knowledge of the study prior to the first class 
meeting, at which time they were given in- 
structions appropriate to their group. At 
registration they were given the option of
either section and were then assigned randomly
to one of three subsections meeting at
that hour. They had no option with respect 
to experimental treatment. 

The study was conducted as a 2 X 3 fac- 
torial analysis of variance (Lindquist, 1953, 
Chapter 9). The two-point variable was 
class attendance, and the three-point variable 
was the amount of time structure provided 
the course content. 

Two extremes of class attendance were 
used. These were required attendance at all 
scheduled class meetings and no classes 
scheduled. 

Three degrees of time structure, represent- 
ing the practical extremes of this variable as 
well as a midvalue, were used. These were 
the assignment of the entire course content 
as a single unit, four separate and nonover- 
lapping assignments corresponding to four 
major units of the textbook, and 25 nonoverlapping
assignments corresponding to
the number of scheduled class sessions. An
appropriate test was administered at the 
end of each assignment period covering that 
assignment only. 

The daily tests administered to the maxi- 
mum structure groups were of approximately 
5 minutes’ duration. The unit tests administered
to the intermediate structure groups
were approximately 10 times as long. The 
class groups were administered tests as re- 
quired by the experimental design and spent 
the remaining class time in lecture-discussion 
situations. The no class groups attended class 
to take appropriate tests and were then dis- 
missed. There was no other class activity for 
these students. No information concerning
test scores or student progress was given any 
individual or group. 

Seven professors and graduate students 
served as lecturers and rotated among class 
groups in such a manner that each lecturer 
had equal exposure to each group. The 
classes were generally conducted as lectures 
with students free to raise questions. 

The criterion of achievement was perform- 
ance on a 100-item multiple-choice test ad- 
ministered as a final examination on a Tues- 
day night in the university’s gymnasium. The 
investigator, who had administered all pre- 
vious tests, did not participate. Achievement 
was determined by the number of correct 
responses on this examination. 

So that all students might have similar ex- 
pectations with respect to the criterion exam- 
ination, at the beginning of the term each 
student was given orally and in printed form 
specific examples of items, content, and scoring
procedures which would constitute the
final examination. Items over specific con- 
tent which had been covered by the tests 
administered to the maximum and the inter- 
mediate structure groups during the study 
were excluded from the examination, as were 
all items used during prior terms. Ten 
“warm-up” items preceded the examination, 
and six forms, with the item positions ran- 
domized between forms, were employed. To 
help alleviate student anxiety, comments 
were solicited as suggested by Teevan and 
McKeachie (1954). 

The items contained in the criterion ex- 
amination were answerable from information 
contained in the assignments made to all 
students regardless of the group to which 
they were assigned. Twenty of the items, 
however, were specific to portions of the text- 
book which had been deliberately stressed 
and/or elaborated upon by the lecturers be- 
fore the three class groups. 


RESULTS 


The mean difference in performance 
between the class and the no class groups 
on the 20 criterion examination items 
covering material stressed in class was 
.133, which was not significant at the .25 
level of confidence (t = .455). The reliability
of this subtest, estimated by Kuder-Richardson
Formula 20, was .437.
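
For readers unfamiliar with Kuder-Richardson Formula 20, a minimal sketch of the computation follows. The item responses are hypothetical toy data, not the study's, and the use of the population variance of total scores is an assumption about the variant of the formula applied.

```python
def kr20(item_matrix):
    """Kuder-Richardson Formula 20 for dichotomous (0/1) item scores.

    item_matrix holds one row per student and one column per item.
    The population variance of total scores is used (an assumption).
    """
    n = len(item_matrix)
    k = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    # Sum of p * q over items, where p is the proportion passing the item.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses of four students to three items.
toy_responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(toy_responses))
```

The coefficient rises as item variances become small relative to the variance of total scores, which is one reason a 20-item subtest would be expected to show lower reliability than the full 100-item examination.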

In view of the lack of a significant 
difference in performance between the 
class and no class groups on these 20 
items, the scores obtained on the entire 
100-item criterion examination were 
used for further analyses. The reliability 
of these 100 items, estimated by Kuder- 
Richardson Formula 20, was .845. The 
means and standard deviations of cri- 
terion scores for each group are con- 
tained in Table 1. The F ratios for 
neither the class attendance variable 
(F = 1.10), the time structure variable 
(F = 1.13), nor the treatment interac- 
tion (F = 1.64) were significant at the 
.10 level of confidence. 
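
The marginal means in Table 1 can be checked against the cell means by weighting each cell by its number of students. The sketch below does this with the cell values reported in Table 1; the variable and function names are hypothetical.

```python
# (N, mean) for the minimum, intermediate, and maximum structure groups,
# as reported in Table 1.
cells = {
    "class": [(50, 49.520), (57, 53.105), (47, 49.489)],
    "no_class": [(49, 48.633), (57, 49.140), (60, 50.950)],
}


def pooled_mean(groups):
    """Mean of several groups combined, weighting each by its size."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n


class_mean = pooled_mean(cells["class"])
no_class_mean = pooled_mean(cells["no_class"])
grand_mean = pooled_mean(cells["class"] + cells["no_class"])
print(round(class_mean, 3), round(no_class_mean, 3), round(grand_mean, 3))
```

The pooled values agree with the tabled totals (50.838, 49.645, and 50.219) to within rounding of the cell means, and the cell sizes sum to the 320 students retained in the analysis.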

Fifteen students dropped the course 
and are not included in Table 1. The 
relation between dropouts and treat- 
ment groups was not significant at the 
.05 level of confidence (χ² = 5.775,
df = 2). 
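
The reported chi-square can be converted to a probability without tables, because with 2 degrees of freedom the upper-tail probability of the chi-square distribution reduces to exp(−x/2). The sketch below applies this to the reported value; the function name is hypothetical.

```python
import math


def chi2_upper_tail_df2(x):
    """Upper-tail probability of a chi-square statistic with df = 2."""
    return math.exp(-x / 2)


p_dropouts = chi2_upper_tail_df2(5.775)
critical_05 = -2 * math.log(0.05)  # value of chi-square at p = .05, df = 2

print(round(p_dropouts, 4))
print(round(critical_05, 3))
```

The probability falls just above .05, which is consistent with the statement that the relation was not significant at that level, although only narrowly so.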

Throughout the quarter there were
six students who made an office visit for
the purpose of asking for assistance with
a specific assignment, although instructors
were available during 37 weekly
office hours. Another 13 made office
visits to discuss matters which were not
specific to the course. The relation between
office visits and treatment groups
was not significant at the .10 level of
confidence (χ² = 4.380, df = 2).


TABLE 1


MEANS AND STANDARD DEVIATIONS OF CRITERION EXAMINATION SCORES AND THE
NUMBER OF STUDENTS IN EACH EXPERIMENTAL TREATMENT COMBINATION


Treatment               Amount of time structure
combination       Minimum    Intermediate    Maximum    Total

Class
  M               49.520     53.105          49.489     50.838
  SD              11.176     11.592          11.518     11.532
  N               50         57              47         154
No class
  M               48.633     49.140          50.950     49.645
  SD              11.024     11.401          10.396     10.915
  N               49         57              60         166
Total
  M               49.081     51.123          50.308     50.219
  SD              11.053     11.618          10.932     11.214
  N               99         114             107        320


DISCUSSION 


An interesting aspect of the present 
study was the similarity of achievement 
by students attending daily lecture-dis- 
cussion sessions and those not attending 
classes on the 20 criterion items which 
might have been considered on a priori 
grounds to favor the group attending 
class. Because of the low reliability of 
these items, it can be stated only that 
this study failed to demonstrate the use- 
fulness of the classroom presentation of 
materials which were available to stu- 
dents in printed form. This result must 
be cautiously labeled as suggestive, and 
further investigation is clearly indicated. 

More confidence can be placed in the 
factorial analysis using the 100-item cri- 
terion examination with its reliability 
coefficient of .845. The null hypotheses 
of no differences between groups with 
respect to class attendance and time 
structure cannot be rejected. This find- 
ing confirms and extends the results of 
studies previously reported with respect 
to the influence of class attendance, and 
the ineffectiveness of time structure 
tends to emphasize the importance of 
knowledge of results and reinforcement 
in the achievement of college students. 

If teachers are willing to accept 
achievement on the final course exami- 
nation as both a valid and a reliable 
criterion, it would appear that students 
can be expected to achieve as well in 
introductory psychology courses by 
studying the appropriate assignments 
“on their own” as they can with what- 
ever assistance may be provided through 
class attendance. Unless some content
or technique demonstration is provided
in class which is not otherwise available, 
the purpose of requiring class attend- 
ance for college students is not evident 
in this study.

Unsolicited statements from students 
and scores on the tests administered to 
the maximum and the intermediate 
structure groups lead the investigator to 
believe that the use of tests was effective 
in structuring the course in terms of 
both content and time. It is therefore
assumed that, within the limits of the 
duration of the present study and the 
content involved in the course, the 
ineffectiveness of time structured con- 
tent as a determiner of achievement has 
been demonstrated. 

The foregoing should not be inter- 
preted as indicating the ineffectiveness 
of testing as a procedure for enhancing 
achievement. Important motivational 
and instructional values of tests were 
eliminated. Tests were used only as a 
means of structuring the course with 
respect to time and content. In this 
role, they did not improve student 
achievement.

The number of dropouts, which was 
not inconsistent with dropouts in similar 
courses, and the number of students 
seeking assistance with assignments, 
were interpreted as behavioral evidence 
that there was no general lack of con- 
fidence on the part of students that they 
could perform in an acceptable manner 
when made solely responsible for their 
own achievement. 


ATTITUDINAL RIGIDITY AS A MEASURE OF
CREATIVITY IN GIFTED CHILDREN 


ELYSE S. FLEMING 
Western Reserve University 


SAMUEL WEINTRAUB¹
University of Chicago 


The performance of 68 academically talented elementary school chil- 
dren on a battery of verbal and nonverbal creativity tests was 
correlated with performance on a paper and pencil test purporting 
to measure attitudinal rigidity. A moderate negative relationship 
(r = −.41) was found to exist between rigidity and verbal creativity
only. Investigation of the relationship of chronological age, intelli- 
gence, and sex indicated that while neither sex nor intelligence were 
significant factors, chronological age appeared to be related to verbal 
creativity production. Refinement of the rigidity measure seems indi- 
cated as an administratively feasible technique for the rough screening
of verbally creative children.


Most empirical evidence in the area 
of creativity has thus far been largely 
concerned with the identification of this 
attribute in older children and adults. 
The need for objective measures in the 
early identification of creativity in the 
gifted elementary school child is appar- 
ent. Currently Torrance and his co- 
workers (Torrance, Yamamoto, Sche- 
nitzki, Palamutlu, & Luther, 1960) have 
pioneered in the development of cre- 
ativity tasks suitable for use with young 
children. 

Investigations (Taylor, 1958) with 
adolescents and adults have demon- 
strated the interrelatedness of creativity 
and personality traits. In describing the 
personality characteristics of creative 
adolescent art students, Hammer (1961) 
has noted that these talented adolescents 
may be differentiated from merely facile 
students on the basis of personality dy- 
namics. Among the specific personality 
traits found to differentiate between the 
creative and the noncreative are impul- 
siveness, self-confidence, tolerance of 
ambiguity, and less need for discipline 
and orderliness (Guilford, Christensen,
Frick, & Merrifield, 1957). Barron
(1958) verified disorderliness and tolerance
for the chaotic in the creative
adult.


¹ Formerly at Western Reserve University.

An investigation (Fleming, 1956) 
into the nature of attitudinal rigidity 
identified this personality trait in ele- 
mentary school children by means of an 
objective measure. The construct of 
rigidity is often defined in terms of in- 
flexibility, stereotypy, intolerance of am- 
biguity, and a compulsive need for 
order, terms which represent antonyms 
of the most widely used definitions of 
creativity. Therefore, the attitudinal 
rigidity construct was thought to be a 
worthwhile avenue of exploration in 
approaching the creativity continuum 
inversely. 

The present study was an exploratory 
one, designed to investigate the rela- 
tionship between verbal and nonverbal 
creative task performance and the per- 
sonality dimension of rigidity or the 
intolerance of ambiguity in gifted ele- 
mentary school children. Subsumed 
under this general purpose is
the determination of the usefulness of
the rigidity concept in identifying
aspects of creativity. Secondarily, the
intent was to examine relationships be- 
tween verbal and nonverbal creativity 
and several other factors, such as age 
and intelligence in academically tal- 
ented children. 

Inasmuch as earlier work (Barron, 
1958; Guilford et al., 1957) relating 
personality factors, including the rigidity 
construct, to creativity has employed 
adult populations, the present effort was 
directed toward examining the applicability
of these findings to children. Following
the work of Getzels and Jackson
(1960) and Torrance (Torrance et
al., 1960) whose efforts have been with 
gifted children, the current study pur- 
ported to further describe this segment 
of the population. 


METHOD 
Subjects 


The subjects were 68 elementary school 
children who had just completed Grades 3, 
4, 5, and 6 and were enrolled in a summer 
demonstration school for academically tal- 
ented children. The children were selected on 
the basis of teacher recommendation, and IQ 
scores and achievement data where available. 
Subjects ranged in age from 8 to 12½ years.
The Pintner General Ability Test, Verbal 
Series, Intermediate Form was administered 
after admission to the program. Table 1 
shows the mean ages and IQs by grade level. 


TABLE 1


AGE AND IQ OF ELEMENTARY SCHOOL
SAMPLE ACCORDING TO GRADE


Grade             Mean age (in months)

3                 107.63
4                 120.94
5                 132.50
6                 144.38
Total sample      125.46


Measures 


A battery of verbal and nonverbal creative 
thinking tasks was assembled from the Tor- 
rance Compendium (Torrance et al., 1960). 
The six verbal tasks included Impossibilities, 
in which the child is instructed to list as 
many impossibilities as occur to him; Con- 
sequences, in which the child is asked to 
respond to three questions, such as “What 
would happen if man could become invisible 
at will?”; Situations, in which the child must 
react to three problematic situations, such as 
“If all schools were abolished, what would 
you do to try to become educated?”; Unusual
Uses, in which the child lists as many un- 
usual uses of tin cans as he can; Common 
Problems, in which the child is asked to list 
problems which might be associated with 
doing homework and taking a bath; Im- 
provements, in which suggestions are elicited 
as to ways of improving three common ob- 
jects, e.g., bicycles. A 5-minute time limit 
obtains for each of the six verbal tasks. 

The nonverbal battery included the follow- 
ing three tasks: Picture Construction, in 
which a colored curved shape is presented 
and directions stipulate drawing a picture 
using the shape as the basis for composition; 
Incomplete Figures Test, in which six-line
stimuli are presented as the basis for creative 
responses; the Circles Test, in which the 
subjects are requested to react to 42 circles 
by using each circle as the nucleus for an 
original idea. A 10-minute time limit pre- 
vails for each of the three nonverbal tasks. 

The Modified Revised California Inven- 
tory (Fleming, 1956) was used as a measure 
of attitudinal rigidity or the intolerance of 
ambiguity. This is a 60-item paper and 
pencil test originally developed by Frenkel- 
Brunswik and revised downward by one of 
the present authors for use with elementary 
school children. Preliminary validation indi- 
cates a point-biserial correlation of .484 with 
the Luchins Maze Test for Einstellung effect 
(Fleming, 1956). 


Procedure 


Each of the four classes was administered 
all tests by the classroom teachers after a 
brief orientation by the investigators. Ad- 
ministration was spaced over a 3-week period. 
Because of the subjectivity of the scoring of 
the verbal and nonverbal creativity tasks, the 
investigators jointly scored all of the test 
protocols in an effort to increase scoring 
reliability. Each of the creativity subtests 
was scored on three indices: Ideational
Fluency, the number of relevant responses 
less the number of redundancies; Spontane- 
ous Flexibility, the number of categories 
represented; Originality, the uniqueness of 
the responses. Two total scores were derived 
comprised of the combined originality, flexi- 
bility, and fluency scores for both the verbal 
subtests and the nonverbal subtests as suggested
by Torrance.²
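
The three scoring indices can be illustrated with a small sketch. It rests on several assumptions not stated in the text: a redundancy is treated here as an exact repeat of an earlier relevant response, each response has already been assigned a category by the scorer, and the originality points are supplied by the scorer rather than computed. All names and data are hypothetical.

```python
def score_subtest(responses, originality):
    """Score one creativity subtest on fluency, flexibility, and total.

    responses: list of dicts with keys "text", "category", "relevant".
    originality: points awarded by the scorer (judged, not computed).
    """
    relevant = [r for r in responses if r["relevant"]]
    distinct = {r["text"].strip().lower() for r in relevant}
    # Assumption: a redundancy is an exact repeat of a relevant response.
    redundancies = len(relevant) - len(distinct)
    fluency = len(relevant) - redundancies
    flexibility = len({r["category"] for r in relevant})
    return {
        "fluency": fluency,
        "flexibility": flexibility,
        "originality": originality,
        "total": fluency + flexibility + originality,
    }


# Hypothetical Unusual Uses responses for tin cans.
responses = [
    {"text": "pencil holder", "category": "container", "relevant": True},
    {"text": "flower pot", "category": "container", "relevant": True},
    {"text": "Pencil holder", "category": "container", "relevant": True},
    {"text": "drum", "category": "music", "relevant": True},
    {"text": "blue", "category": "none", "relevant": False},
]
scores = score_subtest(responses, originality=2)
print(scores)
```

Summing the three indices, as in the total above, follows the description of the combined verbal and nonverbal scores; any differential weighting of the indices would be a further assumption.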


RESULTS 


Table 2 shows the means and stand- 
ard deviations of the verbal and non- 
verbal creativity tests and of the rigidity 
test for the four grade levels represented. 
A sizable increase in the verbal cre- 
ativity means may be noted between 
Grades 3 and 4 with a somewhat les- 
sened drop occurring between Grades 
5 and 6. There is an increase in the 
variability of the fourth grade group. 
The nonverbal creativity means peak at 
Grade 4 but remain fairly constant for 
Grades 5 and 6. The developmental 
pattern of rigidity scores indicates a 
decrease with age through Grade 5
with a slight rise for the Grade 6 group. 
This may be a function of the higher 
intelligence of the fifth grade group 
sampled. 


² E. Paul Torrance, personal communication,
July 1961.


Correlations were obtained between 
verbal and nonverbal creativity tasks 
with each other and with rigidity, 
chronological age, and IQ. Table 3 
summarizes these findings. The highest 
Pearson correlation coefficient was ob- 
tained between the two creativity tasks 
themselves, while a moderate degree of 
negative relationship existed between 
verbal creativity performance and rigid- 
ity score. No statistically significant 
relationships were found between non- 
verbal creativity and rigidity. As was 
anticipated because of the relative ho- 
mogeneity of the group with respect to 
intelligence, no relationship was found 
between either verbal or nonverbal cre- 
ativity and IQ. Age similarly did not 
appear to be a factor related to cre- 
ativity as herein measured. 

Because of the inverse relationship 
found to exist between rigidity and 
verbal creativity, a decision was made 
to explore further the components of 
verbal creativity as they related to rigid- 
ity. As may be noted from Table 3, 
ideational fluency, spontaneous flexi- 
bility, and originality bear a similarly 
negative relationship to rigidity but one 
of less magnitude than the combined 
total score. 

Inasmuch as a t test between the
means on verbal creativity of the 40
boys and 28 girls showed no statistical
significance (critical ratio = .065), it
was decided to omit this factor from
further analysis.


TABLE 2


MEANS AND STANDARD DEVIATIONS OF CREATIVITY BATTERIES AND
RIGIDITY TEST BY GRADE LEVELS


                       Verbal             Nonverbal
                       creative tasks     creative tasks     Rigidity
Grade           N      M         SD       M        SD        M        SD

3               19     64.11     17.61    44.63    13.34     22.89    4.65
4               17     130.18    52.69    59.82    15.73     20.35    6.83
5               16     137.81    32.40    47.94    14.12     13.81    4.12
6               16     107.13    38.24    47.31    17.09     16.13    3.74
Total sample    68     108.09    47.33    49.84    16.18     18.53    6.26


TABLE 3


CORRELATIONS OF CREATIVITY BATTERIES WITH
EACH OTHER, WITH VERBAL COMPONENTS,
RIGIDITY, CA, AND IQ


                                       Nonverbal
Variables               Rigidityᵃ      Creativity     CA      IQ

Verbal Creativity                      .53*           .08     .04
Verbal Originality
Verbal Fluency          −.40*
Verbal Flexibility      −.32*
Nonverbal Creativity    −.15                          .02     .06
CA                      −.50*


ᵃ Partial r (age held constant) between rigidity and
verbal creativity = −.43.

* Significant at the .01 level.


DISCUSSION AND CONCLUSIONS


The moderate inverse relationship 
found between verbal creativity scores 
and attitudinal rigidity scores suggests 
that this approach may warrant further 
investigation for purposes of identifying 
creative, gifted elementary school chil- 
dren. It may be noted that the rigidity 
scale correlates with creativity in the 
same order of magnitude, albeit nega- 
tively, as do creativity subtests with each 
other (Piers, Daniels, & Quackenbush, 
1960). The present subjective and 
time consuming scoring procedure for 
creativity indices militates against their
widespread use as an instrument in the 
repertoire of educators and psycholo- 
gists. If verbal creativity can be assessed 
indirectly through a relatively simply 
scored objective attitudinal measure, 
perhaps greater numbers of children 
could be screened for later more inten- 
sive assessment. Measurement in the
creativity and rigidity areas is in its
infancy and rigorous effort in valida- 
tion must occur before any definitive 
conclusion can be drawn, however. Fur- 
ther refinement of the attitudinal rigidi- 
ty scale toward this end seems indicated. 


The fact that verbal and nonverbal 
creativity tasks were found to be only 
moderately related to each other sug- 
gests that while there may be common- 
alities, they may also be measuring dif- 
ferent aspects of the creativity attribute, 
an hypothesis which may help to ac- 
count for the lack of relationship found 
between nonverbal creativity and atti- 
tudinal rigidity. The present study can 
only raise this as another unanswered 
question. 

At the fourth grade (end of third 
grade) there is a drop in verbal cre- 
ativity which corresponds with results 
found by Torrance (Torrance et al., 
1960). Partialing out age had no effect 
on the magnitude of the negative rela- 
tionship between verbal creativity and 
rigidity (r23 = −.43). The investigators
postulate that in the present study
at least, the skewed results may be a 
function of the more mature physical 
coordination reflected in the mechanics 
of writing as well as the increased ex- 
periential background of the older chil- 
dren. If, in fact, younger children are 
penalized on verbal creativity tests by 
virtue of limited experience and physi- 
cal agility, then the argument for the 
use of an objective type instrument is 
further strengthened. 

If the creative individual does indeed 
live closer to the inner reaches of his 
personality, then an approach to the 
assessment of creativity potential may 
be made through better understanding 
of personality dynamics. Particularly
may this be the case with younger chil- 
dren in whom the overlay of culture is 
readily permeable.


SERIAL ANALYSIS OF VERBAL ANALOGY PROBLEMS1


DONALD M. JOHNSON 
Michigan State University 


A serial exposure method for separating a problem solving episode 
into 2 phases is applied to analogy problems constructed so that the 
difficulty can be attributed to either the inductive operation of the 
1st phase or the deductive operation of the 2nd phase. When 25
problems of each type were solved by 60 college Ss, the time differ- 
ences clearly supported this distinction. The problems were presented 
in 3 formats named, according to the second phase: multiple choice, 
initial letters, and production. In respect to time of the 2nd phase, 
initial letters resembled multiple choice, negating the hope that this 
format required a production process. 


A serial exposure method has been 
devised for separating problem solving 
episodes into two periods, characterized 
by different problem solving processes, 
and the results obtained describe the 
sequence of intellectual operations in 
the solution of certain verbal and figure 
concept problems (Johnson, 1960, 
1961). The present investigation studies 
the verbal analogy problem because it is 
a standard problem, widely used in psy- 
chological research, and because its logi- 
cal properties generate testable hypoth- 
eses about problem solving processes. 

Consider the analogy: feline is to 
canine as cat is to ? Finding the rela- 
tion between the first pair of words is 
called induction and applying this rela- 
tion to the second pair is called deduc- 
tion. Since feline and canine are less 
familiar than cat and dog, we assume 
that the principal difficulty in the solu- 
tion of this problem lies in the inductive 
operation. The deductive operation 
would be the locus of the difficulty in 
this problem: lose is to win as liability 
is to ? If this assumption is correct, 
relatively more time will be spent on 
preparatory study of the first pair of 


1Supported in large part by a grant from 
the National Science Foundation. 


words for those problems that emphasize 
induction than for those that emphasize 
deduction. 

Analogy problems can be presented 
in several formats. The production or 
completion type, illustrated above, re- 
quires the subject to produce the solu- 
tion. In the multiple-choice type alter- 
native solutions are displayed from 
which the subject makes a selection. We 
would expect that the difference be- 
tween these two types would consist 
principally in the time of the second 
period since the former requires the sub- 
ject to produce his deduction while the 
latter offers him a choice. 

Another format, used in several ACE 
psychological examinations (ACE, 1945) 
as a test of verbal comprehension, may 
be called the initial-letters type. Instead 
of words, as in multiple choice, several 
letters are presented, one of which is 
the initial letter of the correct solution. 
It may be adapted to the analogy prob- 
lem thus: feline is to canine as cat is 
to a, b, c, d, e. This type is worth 
special investigation because it seems to 
combine the advantages of both the 
above types by requiring some production
of solutions while retaining the
scoring convenience of the multiple-choice
format. It may be assumed that
the subject will produce the solution,
then identify it by selecting the correct 
initial letter, but it is also possible that 
the letters supply cues which reduce the 
need for production. In either case, one 
would expect that objective data from 
serial analysis would place this type of 
problem between the other two. 


METHOD 


Twenty-five analogy problems were con- 
structed with the emphasis on induction and 
25 with the emphasis on deduction, and all 
50 were given in irregular order to three 
groups of 20 subjects each. The subjects 
were students from elementary psychology 
classes. As in previous experiments the 
material was divided into two parts and 
exposed serially behind a half-silvered mirror. 
Both exposures were controlled by the subject 
and timed electrically. The preparatory exposure,
consisting of the first pair of words,
was the same for all three groups. For the 
production group the second exposure 
showed only the first word of the second 
pair. When the subject finished studying the 
first pair, he took a pen from a specially built 
penholder switch which turned off the first 
exposure, turned on the second exposure and 
illuminated a pad of paper on which the 
subject wrote his solution. Replacing the 
pen in the holder terminated the second 
exposure. 

For the multiple-choice group the second 
exposure included the first word of the 
second pair and five numbered words as 


alternative solutions. When the subject fin- 
ished studying the first pair of words, he 
turned a toggle switch which turned off the 
first exposure and turned on the second. He 
then chose one of the solutions and pressed 
one of five numbered buttons, thus register- 
ing his choice and terminating the second 
exposure. 

For the initial-letters group the second 
exposure presented only the first word of the 
second pair, but the subjects were informed 
that the solution to each problem was a 
word beginning with a, b, c, d, or e, and 
the buttons were so labeled. The procedure 
was otherwise the same as for multiple choice. 

Thus for each problem a record was ob- 
tained of the time spent studying the first 
pair of words, called preparation time, the 
time spent on the second period, and the 
solution written or selected. 


RESULTS 


Table 1 shows that the preparatory
period, during which the inductive process
presumably occurs, is definitely longer
for problems intended to emphasize
induction than for those intended to 
emphasize deduction. All three differ- 


ences are significant at the .01 level. 
The differences are reversed in the sec- 
ond period, as expected, and these dif- 
ferences are significant at the .01 level 
for production and initial letters. Total 
times are about the same for the induc- 
tion and deduction problems except for 


multiple choice, in which case the difference
is significant at the .01 level.


TABLE 1

MEAN TIME, IN SECONDS PER PROBLEM, AND FREQUENCY OF ERRORS PER
PROBLEM FOR SOLUTION OF SIX TYPES OF ANALOGY PROBLEM

Column headings: Type of problem; Preparation time; Second period;
Total; Preparation index
Rows: Production (Induction, Deduction); Initial letters (Induction,
Deduction); Multiple choice (Induction, Deduction)

[Cell values are largely missing from this copy. Surviving entries,
column and row unknown: 36, .23, 44, 31, 28]

A compact way of summarizing these 
differences in time spent on the two 
periods is the preparation index, ob- 
tained by dividing preparation time by 
total time to get the proportion of the 
whole that is spent on the first period. 
In the three comparisons this index is 
larger for problems of induction than of 
deduction, all significant beyond the 
.01 level. Hence timing these two periods 
separately yields results consistent with 
logical expectations. 
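
The preparation index described above can be sketched in a few lines of Python. This is only an illustration: the function name and the timing values are hypothetical and are not taken from Table 1, and total time is assumed to be the sum of the two periods.

```python
def preparation_index(preparation_time, second_period_time):
    """Proportion of total solution time spent on the first (preparatory) period."""
    total_time = preparation_time + second_period_time
    if total_time <= 0:
        raise ValueError("total time must be positive")
    return preparation_time / total_time


# Hypothetical timings in seconds (illustrative only, not the study's data).
induction_index = preparation_index(6.0, 4.0)
deduction_index = preparation_index(3.0, 7.0)
print(induction_index, deduction_index)
```

With these made-up values the index is larger for the problem that emphasizes induction, which is the direction of the difference reported in the text.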

To obtain a measure of errors for 
solutions to the production problems,
scoring criteria were applied which re- 
quired that the word be of correct gram- 
matical form and that a standard mean- 
ing be used. For the initial-letters and 
multiple-choice types a simple count of 
errors was used, with no correction for 
chance. The relative frequencies of 
errors obtained by these procedures are 
shown in Table 1. It is apparent that 
the induction and deduction problems 
were approximately equal in difficulty. 
In none of the three comparisons is the 
difference significant. 

Turning now to comparisons of the 
types of format, it is apparent that in 
respect to time spent on preparation 
the three types are about the same. (In 
only one of the six comparisons, initial- 
letters deduction against multiple-choice 
deduction, is the difference significant.)
This is to be expected because the words 
exposed are the same. The second 
period is significantly longer for produc- 
tion problems than for the other two 
types because of the time taken to write 


a word. More important is the fact that 
the second period is not significantly 
longer for the initial-letters format than 
for the multiple-choice format—either 
for induction or deduction problems. In 
view of this comparison it would be 
hard to maintain that the initial-letters 
format demands a production process of 
any consequence. The alternative hy- 
pothesis that cues supplied by the initial 
letters reduce production to a minimum 
is more plausible. 

In general, these results confirm the 
logical assumption that, when analo- 
gies are constructed so as to emphasize 
induction, the inductive operation re- 
quires relatively more time, and, corre- 
spondingly, when deduction is empha- 
sized, the deductive operation requires 
more time. This confirmation supports 
the validity of serial analysis of prob- 
lem solving processes. Description of 
the three types of problem by serial 
analysis of the two periods of time puts 
the initial-letters type between the pro- 
duction type and the multiple-choice in 
most respects, as expected, but there is 
no support for the hope that the initial- 
letters format requires a productive 
process of any consequence.


THE EFFECT OF OVERT VERSUS COVERT RESPONDING
TO PROGRAMED INSTRUCTION ON IMMEDIATE
AND DELAYED RETENTION1


RONALD G. WEISMAN
Michigan State University

JOHN D. KRUMBOLTZ2
Stanford University

To test the effect of overt vs. covert responding in programed instruc- 
tion, 54 undergraduates in educational psychology were randomly 
assigned to 4 groups: a group who wrote down each response, a 
group who “mentally composed” each response, a group who read 
the program in which the blanks were already filled, and a control 
group who wrote their answers to a completely different program of 
about the same length. A 50-item test was administered following 
the study period, and an alternate form 2 weeks later. The 3 response 
mode groups did not differ significantly on the 1st test. However,
on the delayed test the written response group scored significantly
higher than the other 2 groups. The control group scored significantly
lower on both tests. Thus, overt responding appears to increase


delayed retention. 


The importance of the overt response 
to learning has been emphasized by 
many learning theorists. This emphasis 
has been incorporated in programed 
learning materials which instruct the 
student to write out his answer (e.g., 
Holland & Skinner, 1961, p. viii). 

However, only one study so far has 
supported the position that overt re- 
sponding is necessary for efficient learn- 
ing (Holland, 1960). Students who 
wrote out their responses to a standard 
program made fewer errors on a cri- 
terion test than a group who read the 
same material as complete statements. 
More important is the fact that groups 
whose programs required them to write 
out trivial responses or responses to dif- 
ficult and ambiguous items made about 
as many errors as the reading group. 
Although variances within groups and 


1This research was supported by a grant 
from the United States Office of Education, 
Educational Media Branch, under Title VII 
of the National Defense Education Act. 

2Formerly at Michigan State University 
where this research study was conducted.


tests of significance were not reported,
these conclusions have been supported
by a replication of this study which
yielded essentially the same results.3
Written responses would appear to be 
a necessary, though not sufficient, condi- 
tion for efficient learning according to 
this evidence. 

Other studies recently reviewed 
(Krumboltz, 1961) have failed to reveal 
any superiority for overt responses. 
Silverman and Alter (1960) reported 
that students who read a program on 
basic electricity without overtly respond- 
ing scored significantly higher on an 
8-item fill-in test than those who filled 
in the blanks in the program. An ex- 
periment by Goldbeck, Campbell, and 
Llewellyn (1960) which compared the 
effectiveness of four modes of respond- 
ing (overt, covert, optional overt, and 
reading) revealed no significant differ- 
ences among the criterion means of the 
four response modes. 


3James G. Holland, personal communication.


Evans, Glaser, and Homme have reported
two studies (1960a, 1960b) in
which overtness of response was investigated.
In neither case were significant
differences found, although the small 
number of cases in each treatment 
group did not allow for a very powerful 
test of the hypothesis. Roe, Massey, 
Weltman, and Leeds (1960) found 
that a programed textbook which required
no overt responses took significantly
less time to complete and yielded a higher,
though not significantly higher, score on the
criterion test than did the programed 
textbook which required written re- 
sponses. 

Although most studies so far reported 
have failed to support the importance 
of overt responding, the question should 
not be considered settled. Short pro- 
grams with unknown error rates, brief 
criterion tests, and small Ns make it 
difficult to reject the null hypothesis. In 
addition the criterion tests in previous 
studies were always administered im- 
mediately after the completion of the 
program, and no delayed measures of 
retention were obtained. 


METHOD 
Subjects 


Fifty-four undergraduates on whom complete
data were available constituted the
sample. They were members of an educational
psychology class enrolling 86 students
during the summer of 1961. 


Programed Materials 


The programed textbook used in this ex- 
periment was designed to teach prospective 
teachers some fundamentals of educational 
test interpretation. Its 177 frames were 
planned to give students a conceptual under- 
standing (not computational proficiency) of 
percentiles, age and grade scores, norms, 
normal distribution curves, standard devi- 
ations, and z scores. The program had been 
completely revised once after an earlier ver- 
sion had been tried out on 80 students. A 
tally of the responses to each frame and to a
criterion test provided the necessary data
for revision. The revised program was found 
to have an average error rate of 10% on an
equivalent sample of 180 educational psy- 
chology students. 


Procedure 


Students were randomly assigned to four 
treatment groups: (a) the written response 
group was instructed to write down their 
responses to the program on separate sheets 
of paper and to score each response as right 
or wrong after turning the page to the cor- 
rect answer; (b) the covert response group 
was instructed to “mentally compose” a re- 
sponse to each blank before turning the page 
to the correct answer but not to write down 
any answer; (c) the reading group was 
instructed simply to read the program which 
was identical to the others except that the 
correct answers were written in the blanks 
instead of appearing on the following page; 
(d) the control group was given a completely 
different 150-frame program designed to 
teach them how to write valid questions for 
classroom tests. 

The randomization was accomplished by 
arranging the four sets of booklets in a ran- 
dom order before handing them to students. 
The regular instructor in the class handed 
out and collected all programed materials 
and criterion tests endeavoring to present 
the material as a regular part of the class- 
room activities. The differing programs were 
explained to students frankly as an attempt 
to try out different versions of this new kind 
of learning material. 

All groups were given the materials to take 
home and use for 3 days. They were told a 
test would be given on the fourth day al- 
though it would not count toward their final 
grade. On the fourth day all programed 
booklets were collected. Answer sheets were 
collected from the written response group 
and the control group. As a check on whether 
students in the first three groups actually 
used their programed booklets, there were 19 
places spaced throughout the booklets where 
the following special instruction appeared: 
“If you have read this far, mark an X here 
( ).” Any student who marked fewer than 15
of these 19 places was eliminated from the 
analysis. Six of the 60 subjects were thus 
eliminated. 
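
The exclusion rule can be sketched as follows. The cutoff of 15 marks out of 19 comes from the procedure above; the function name, the student labels, and the tallies are hypothetical and serve only as an illustration. The check applied to the three groups using the test interpretation program.

```python
TOTAL_CHECKPOINTS = 19  # places where the special instruction appeared
REQUIRED_MARKS = 15     # students with fewer marks were dropped


def retained_students(marks_by_student):
    """Return only the students who marked at least REQUIRED_MARKS checkpoints."""
    return {
        student: marks
        for student, marks in marks_by_student.items()
        if marks >= REQUIRED_MARKS
    }


# Hypothetical tallies (not the study's data).
marks = {"S01": 19, "S02": 14, "S03": 15}
print(sorted(retained_students(marks)))
```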


4The authors are indebted to Ben Thomp- 
son for substantial assistance and helpful 
suggestions.


Criterion Tests


Two alternate forms of the criterion test 
were developed. Each was a 50-item com- 
pletion test designed to measure students’ 
understanding of the concepts explained in 
the 177-frame program on interpreting educational
tests. All questions involved the
application of concepts to new situations and 
did not duplicate frames in the program. 
Form A was administered on the fourth day 
of the experiment immediately after the 
3-day study period. Form B was administered
2 weeks later. The corrected split-half
reliability was estimated to be .85 for Form 
A and .75 for Form B. The correlation be- 
tween the two forms was .85. All correla- 
tions were based on groups who had pre- 
viously studied some version of the relevant 
program; thus, the control group was excluded
to prevent an inflated reliability
estimate. 
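
The text does not name the correction applied to the split-half coefficients. Assuming it is the conventional Spearman-Brown step-up for a test of doubled length, the computation can be sketched as below; the function name and the input value are hypothetical and are not the study's half-test correlations.

```python
def spearman_brown(half_test_r):
    """Step a half-test correlation up to an estimate of full-length reliability."""
    if half_test_r <= -1:
        raise ValueError("half-test correlation must be greater than -1")
    return 2 * half_test_r / (1 + half_test_r)


# Hypothetical correlation between the two halves of a test.
print(round(spearman_brown(0.5), 3))
```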


RESULTS 


The means and standard deviations 
on both immediate and delayed reten- 
tion tests for all groups are presented in 
Table 1. A randomized groups analysis 
of the immediate test scores yielded evi- 
dence supporting the overall hypothesis 
that the treatments had differential 
effects (F = 5.976; df = 3/50, p < .005).
Since only comparisons between the 
written response group and the other 
three treatments were of interest, the 
t test was judged appropriate for the 
three individual comparisons. Only the 
comparison with the control group was 


significant (t = 10.63; df = 20, p < .001). The immediate criterion test
mean of the written response group did 
not differ significantly from the mean of 
either the covert response group (t = 
1.21) or the reading group (t = 1.10). 
Thus, if the results of the immediate 
criterion test only were considered, the 
hypothesis that a written response is 
superior to covert or reading responses 
in the programed material would not be 
supported.

However, analysis of the criterion test 
administered after a 2-week interval 
produced a different picture. An analy- 
sis of variance of the delayed test yielded 
evidence that the treatments had differ- 
ential effects (F = 2.978; df = 3/50, 
p < .05). Selected individual compari- 
sons of the written response group with 
the other three treatments were all sig- 
nificant. The written response group 
was significantly higher than the covert 
response group (t = 2.33; df = 23,
p < .05), the reading group (t = 2.44;
df = 25, p < .05), and the control
group (t = 8.38; df = 20, p < .001).
Thus, the results from the delayed cri- 
terion test supported the hypothesis that 
a written response facilitates retention 
of programed material. 
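
The degrees of freedom reported above can be checked against the group sizes given in Table 1. The sketch below assumes a one-way analysis of variance and pooled-variance t tests, which the text does not state explicitly; the function names are hypothetical.

```python
# Group sizes as listed in Table 1.
group_sizes = {"written": 10, "covert": 15, "reading": 17, "control": 12}


def anova_df(sizes):
    """Between-groups and within-groups degrees of freedom for a one-way design."""
    k = len(sizes)
    n_total = sum(sizes.values())
    return k - 1, n_total - k


def pooled_t_df(n1, n2):
    """Degrees of freedom for a two-sample t test with pooled variance."""
    return n1 + n2 - 2


print(anova_df(group_sizes))
for other in ("covert", "reading", "control"):
    print(other, pooled_t_df(group_sizes["written"], group_sizes[other]))
```

Under these assumptions the values agree with the reported df of 3/50 for the F tests and 23, 25, and 20 for the three comparisons with the written response group.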


TABLE 1

MEANS AND STANDARD DEVIATIONS FOR TREATMENT GROUPS ON
IMMEDIATE AND DELAYED RETENTION TESTS

Groups: Written response (N = 10); Covert response (N = 15);
Reading (N = 17); Control (N = 12)
Rows: Immediate retention test; Delayed retention test
Entries for each group: mean and SD

[Most cell values are missing from this copy. Surviving entries,
group unknown: Immediate 33.90, Delayed 30.30; 7.69, 7.58]


DISCUSSION


Written responses to programed materials
were found to aid in the retention
of new learning when measured after a
2-week interval although an immediate 
test of learning failed to show any differ- 
ences. Such a result would lend support 
to those who have insisted on the im- 
portance of an overt response, and 
would partially explain why other in- 
vestigators have been unable to find a 
significant difference between overt and 
covert responding. The advantages of 
overt responding may not become clear- 
ly apparent until some time after the 
study period, although the exact pa- 
rameters of retention over time need 
investigation. 

A hint of this result was provided in 
the study by Goldbeck, Campbell, and 
Llewellyn (1960). Although the mean 
scores on the criterion test were not sig- 
nificantly different, the average amount 
of time required by each group to com- 
plete the criterion test did differ sig- 
nificantly. The reading group required 
the longest time on the criterion test 
while the written response group re- 
quired the shortest time—directly the 
reverse of the relative amount of time 
taken to complete the program itself. 
This suggests that the overtly respond- 
ing group justifiably developed more 
confidence in the correctness of their 
answers, and a delayed test of retention 
might have revealed a difference not 
immediately apparent. 

It is not suggested that students mak- 
ing covert responses fail to learn the 
material in a program. The means of 
the covert response group and the reading
group (see Table 1) are considerably
higher than the means of the control
group on both the immediate and
delayed tests. The question of the rela- 
tive efficiency of each response mode is 
an important though difficult problem. 
Although the present study did not re- 
cord the amount of time students spent 
on the program, other studies have uni- 
formly reported that the written re- 


sponse mode required at least 10% more 
time than covert response modes. Is the 
slightly increased retention worth the 
slightly longer time required to write 
out each response, or might this addi- 
tional time be better spent in other 
ways? Such value questions face teach- 
ers and administrators every day and 
ought not to be ignored by researchers.


An experiment was conducted to ascertain the effects of accelerating 
from 2nd to 4th grade pupils of superior learning abilities (SLA) who 
were above the median CA of all 2nd graders and who had attended 
a 5-week summer session. The Ss were 1 group of 26 accelerated to 
the 4th grade; 2 groups of 26 nonaccelerated 3rd graders of SLA, 
1 above and 1 below median CA; 2 groups of 26 nonaccelerated 
4th graders of SLA, 1 above and 1 below median CA; and 2 groups
of 26 nonaccelerated 4th graders of average learning ability, 1 above 
and 1 below median CA. Based on the performances of the Ss on 32 
measures, no unfavorable academic, social, emotional, or physical cor- 
relates of acceleration were found. Acceleration of pupils of SLA and 
above median CA after a 5-week summer session seemed desirable. 


The public elementary schools of Ra- 
cine, Wisconsin, are organized into age- 
graded classes—kindergarten through 
Grade 6. With few exceptions the chil- 
dren are at least 6 years of age on or 
before December 1 when entering the 
first grade. Children who are not subse- 
quently accelerated range in age from 
17-7 to 18-7 when graduating from high 
school, and 21-7 to 22-7 when complet- 
ing four years of college. 

Since there are as many bright chil- 

dren in the older half of all children 
entering the first grade as there are 
slower learning children in the younger 
half, the question is raised: 
Should not the older bright children be ac- 
celerated at some grade or school level so 
that they may complete high school at least 
as young as their less able classmates who 
happened to be born during a month which 
permitted them to be the younger children 
when entering the first grade? (Klausmeier, 
1958, p. 41). 


1This research was financed by the Wisconsin
Improvement Program, Teacher Education
and Local School Systems, John Guy
Fowlkes, Director; and the Department of
Education, University of Wisconsin.


The purpose of this controlled experiment
was to ascertain the effects of
accelerating from the second to the 
fourth grade pupils of superior learning 
abilities who were above the median 
chronological age of all second graders 
in the Racine schools. 


PROCEDURE 
Subjects 


In March 1960, 32 girls and 20 boys in the 
upper half of their grade in chronological 
age and of superior learning abilities were 
identified from the entire second grade pop- 
ulation of the Racine, Wisconsin schools. 
These pupils were born before June 1, 1952,
had a minimum Kuhlmann-Anderson IQ of
115, a minimum total standard score of 300 
on the Elementary Battery of the Metropolitan
Achievement Tests, and a recommendation
for acceleration from their teachers.
The 52 pupils were ordered in pairs,
matched by sex, and then randomly assigned, 
one from each pair to the accelerated group, 
the other to the control group of nonaccelerates.
The accelerates attended a 5-week
summer session in 1960 and were enrolled in 
16 different fourth grade classrooms in Sep- 
tember 1960; pupils in the control group did 
not attend the summer session and were 
enrolled in 15 different third grade class- 
rooms in September 1960. The accelerates 
attended the summer session in order to
complete the essential content of the third
grade which had not been included in the 
second grade. Cursive handwriting, language 
arts, arithmetic, and expressive abilities— 
verbal, artistic, musical, and physical—re- 
ceived main attention. The accelerates met 
with their teacher in a self-contained classroom
from 8:00 a.m. to 12:00 noon, Monday
through Friday. 
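
The matched-pair assignment described for the 52 pupils can be sketched as follows. The report does not say how the random choice within each pair was made, so the coin-flip mechanism, the function name, and the pupil labels here are all assumptions made for illustration.

```python
import random


def assign_matched_pairs(pairs, seed=None):
    """Randomly send one member of each matched pair to each condition."""
    rng = random.Random(seed)
    accelerated, control = [], []
    for first, second in pairs:
        # A fair coin decides which member of the pair is accelerated.
        if rng.random() < 0.5:
            first, second = second, first
        accelerated.append(first)
        control.append(second)
    return accelerated, control


# Hypothetical pupil labels; each pair is matched by sex.
pairs = [("girl_01", "girl_02"), ("girl_03", "girl_04"), ("boy_01", "boy_02")]
accelerated, control = assign_matched_pairs(pairs, seed=1960)
print(accelerated)
print(control)
```

Because each pair contributes exactly one pupil to each condition, the two groups end up with the same number of girls and boys, as in the 16 girls and 10 boys per group reported here.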

Five additional groups were identified, 
each with 16 girls and 10 boys, for compari- 
son with the accelerates. These were as 
follows: one group of third graders of su- 
perior learning abilities below the median 
age of third graders; two groups of fourth 
graders of superior learning abilities, one 
above and the other below the median age 
of fourth graders; two groups of fourth 
graders of average abilities, one above and 
the other below the median age of fourth 
graders. 

The younger third graders of superior 
learning abilities were born after June 1, 
1952; they met the same criteria of IQ and 
achievement as did the pupils randomly 
assigned to acceleration or nonacceleration. 
Pupils in the two fourth grade superior 
groups had a minimum IQ of 115 and a 
minimum total standard score of 400 on the 
achievement battery. Pupils in the older 


group were born before June 1, 1951; pupils 
in the younger group were born after June 1, 
1951. Pupils in the two fourth grade average 
groups had an IQ between 90 and 114 and 
a total standard score between 300 and 400 


on the achievement battery. Pupils in the 
older group were born before June 1, 1951; 
pupils in the younger group were born after 
June 1, 1951. 

Pupils in the four fourth grade groups had 
taken the IQ and achievement tests as third 
graders in March 1960, the same month 
during which the other three groups did as 
second graders. Pupils in the seven groups 
were enrolled in 16 different third and fourth 
grade classrooms in September 1960. 

Comparisons between the two groups, orig- 
inally matched by sex and then randomly 
assigned to acceleration or nonacceleration, 
were made on all data obtained in March 
1961 (at which time the accelerated pupils 
had been in the fourth grade for 7 months of 
the 10-month school year). These compari- 
sons were expected to demonstrate the effects 
of acceleration. Comparisons between the 
accelerated pupils and pupils in the other 
five groups were made to ascertain the extent 
to which the accelerates differed from young- 
er third graders of superior learning abilities, 


and from older and younger fourth graders 
of average and superior learning abilities. 


Types of Data Gathered and Measuring 
Instruments 


Nine types of data—educational achieve- 
ments, attitudes toward school and learning, 
problem solving ability, ethical values, hand- 
writing skills, psychomotor abilities, intellec- 
tual and affective characteristics, peer ac- 
ceptance, and creative thinking abilities— 
were secured on pupils in March 1961. A 
brief description of the instruments used to 
obtain these data follows. 

Educational Achievements. The Metro- 
politan Achievement Tests, Intermediate Bat- 
tery-Partial were administered to all pupils. 
Scores were obtained for Word Knowledge, 
Reading, Spelling, Language Total, Lan- 
guage Study Skills, Arithmetic Computation, 
Arithmetic Problem Solving and Concepts, 
and Total Battery. 

Attitudes toward School and Learning. A 
ranking scale and an inventory were devised 
by the writers to assess the pupils’ attitudes 
toward school and learning. The ranking 
scale, titled Places Liked Best, required the 
pupil to rank the school classroom and eight 
other places from highest to lowest on the 
basis of preference for the places. The Atti- 
tude Inventory consisted of 20 multiple- 
choice items. 

Problem Solving Ability. A test of 25 
multiple-choice items was devised by the 
writers to assess a composite problem solving 
ability as represented in items dealing with 
analogies, logical reasoning, pertinent ques- 
tions, sensitivity to problems, and following 
directions. 

Ethical Values. A multiple-choice inven- 
tory was devised by the writers and research 
assistant, William Franzen, to assess ethical 
values. Responses were keyed as to whether 
they indicated rational conscientiousness, ir- 
rational conscientiousness, group conformity, 
or self-centeredness. The inventory yielded 
a score for each of these four traits. 


Handwriting Skills. A speed score was 
obtained by having each pupil write a stand- 
ard sentence as rapidly as he could for 3 
minutes. The score was a count of the 
number of words written. A legibility score 
was obtained by having each pupil write the 
standard sentence once at his normal speed 
of handwriting. This sample was rated by 
three judges on a nine-point scale. The procedures
of Virgil E. Herrick, University of
Wisconsin, for administering and scoring 
these tests were followed. 

Psychomotor Abilities. Four tests of psy- 
chomotor abilities were devised and admin- 
istered by Grace Piskula, Physical Education 
Supervisor of the Racine public schools: 
Basketball Throw to measure strength and 
coordination of body and arms, Jump and 
Reach to measure coordination of arms and 
legs, Wall Pass to measure eye-hand coordi- 
nation and speed of reaction, and Shuttle 
Run to measure running speed and maneuverability.

Intellectual and Affective Characteristics. 
A Teacher Rating Scale was devised by the 
writers consisting of 12 descriptive items. 
The teacher rated the pupil on each item, 


using a four-point scale. Weights of one to 
four were given to each item and a total 
score was obtained by adding the weighted 
scores. In addition, three items dealing with 
the child’s emotional, social, and physical 
development were checked by the teacher to 
indicate normal or unusual development. 

Peer Acceptance. A sociometric instrument 
was used in which the pupils listed their five 
best friends in rank order. A score for each 
pupil was obtained by tabulating the choices 
received, weighting the choices, five for a first 
choice, four for a second choice, etc., and 
adding the weighted scores. 
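The weighting just described can be sketched in a few lines of Python. This is only an illustration: the function name, the pupil names, and the rankings are hypothetical, and the sketch assumes that a choice in position k of a five-name list earns 6 - k points, which is one reading of "five for a first choice, four for a second choice, etc."

```python
from collections import defaultdict


def sociometric_scores(rankings, n_choices=5):
    """Sum weighted choices received by each pupil.

    rankings maps each pupil to an ordered list of best friends,
    first choice first. A first choice earns n_choices points, a
    second choice n_choices - 1, and so on (assumed weighting).
    """
    scores = defaultdict(int)
    for friends in rankings.values():
        for rank, friend in enumerate(friends[:n_choices]):
            scores[friend] += n_choices - rank
    return dict(scores)


# Hypothetical rankings, not data from the study.
rankings = {
    "Ann": ["Bob", "Cal", "Dee", "Eve", "Fay"],
    "Bob": ["Ann", "Cal", "Eve", "Dee", "Fay"],
    "Cal": ["Ann", "Bob", "Fay", "Dee", "Eve"],
}
scores = sociometric_scores(rankings)
```

A pupil whom nobody names does not appear in the result, which corresponds to a score of zero.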

Creative Thinking Abilities. Seven tests of
creative thinking yielding 10 scores were
administered to all pupils: Object Uses, Word
Uses, Plot Titles, Expressional Fluency, Plot
Questions, Object Improvement, and Sentence
Improvement. These tests, except Sentence
Improvement, were adapted from Guilford
(Guilford, Kettner, & Christensen, 1956) by
research assistant Frank B. May for use with
children. Sentence Improvement was devised
by May. The reliability and validity
coefficients for these instruments are
reported by Ripple (1961) and May (1961);
reliability coefficients only are reported in
Table 1.


TABLE 1

RELIABILITY OF GROUP TESTS

Test                                          Basis or source of r

Metropolitan Achievement
Test-Intermediate Battery:
  Word Knowledge                              Test manual
  Reading                                     Test manual
  Spelling                                    Test manual
  Language Total                              Test manual
  Language Study Skills                       Test manual
  Arithmetic Computation                      Test manual
  Arithmetic Problem Solving and Concepts     Test manual
  Social Studies Study Skills                 Test manual
Places Liked Best                             Alternate forms of test
Attitude Inventory                            Cronbach Alpha
Creative Thinking Battery:
  Object Uses-Fluency                         2 independent subtests
  Object Uses-Flexibility                     2 independent subtests
  Word Uses-Fluency                           2 independent subtests
  Word Uses-Flexibility                       2 independent subtests
  Plot Titles-Fluency                         2 independent subtests
  Plot Titles-Cleverness                      2 independent subtests
  Expressional Fluency                        Cronbach Alpha
  Plot Questions                              Cronbach Alpha
  Object Improvement                          Cronbach Alpha
  Sentence Improvement-Metaphors              Cronbach Alpha
  Sentence Improvement-Onomatopoeia           Cronbach Alpha
Ethical Values Inventory:
  Rational Conscientious                      Cronbach Alpha
  Irrational Conscientious                    Cronbach Alpha
  Group Conforming                            Cronbach Alpha
  Self-Centered                               Cronbach Alpha
Handwriting Legibility                        2 independent raters

Surviving r entries (rows not identifiable): .92, .29, 57, 89, 44


Treatment of the Data 


The analysis of variance was used to deter- 
mine the significance of differences among 
the seven groups in chronological age and 
IQ, criteria used in the selection process. 
Also, an analysis of variance for total battery 
achievement scores and for scores on the 
Teacher Rating Scale was run on the two 
second grade groups, originally matched by 
sex and then randomly assigned to acceleration
or nonacceleration. When an F significant
at the .05 level was obtained, confidence
intervals (Dunnett, 1955) were built around
the differences between the accelerated group
mean and each of the other six group means.²
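As an illustration of the first step, the F ratio of a one-way analysis of variance can be computed as in the sketch below. The function name and the three small groups are hypothetical, and the sketch only returns the statistic with its degrees of freedom; whether F reaches the .05 level would still have to be checked against an F table.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, ms_within) for a one-way layout."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-groups and within-groups sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_within = ss_within / df_within
    f_ratio = (ss_between / df_between) / ms_within
    return f_ratio, df_between, df_within, ms_within


# Hypothetical scores for three groups (the study compared seven).
f_ratio, df_b, df_w, ms_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

The within-groups mean square returned here is the error term that a Dunnett-type interval around each difference from the accelerated group would use.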

A similar treatment was applied to all data 
obtained in March 1961. A two-way classi- 
fication was used in the analysis of variance, 
sex by group, to test differences according to 
group, sex, and interactions between sex and 
group. When a significant interaction was 
found, confidence intervals were built sepa- 
rately for girls and boys. In addition, a chi 
square test was applied to the three develop- 
ment items on the Teacher Rating Scale. Chi 
square was used to test differences in the 
incidence of normal and unusual develop- 
ment of pupils in the seven groups. 
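A minimal sketch of the chi square computation for a table of normal versus unusual development by group follows. The function name and the counts are hypothetical, not data from the study, and the sketch stops at the statistic and its degrees of freedom.

```python
def chi_square(table):
    """Pearson chi square for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df


# Hypothetical counts: rows are normal / unusual, columns are the seven groups.
counts = [
    [18, 17, 19, 18, 16, 17, 18],
    [2, 3, 1, 2, 4, 3, 2],
]
statistic, df = chi_square(counts)
```

With two rows and seven columns the test has six degrees of freedom; the statistic would then be compared with the tabled chi square value at the chosen level.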

When the confidence interval around a 
difference between means did not span zero, 
the difference was interpreted as significant. 


² A table giving the mean scores and results
of tests of significance for the 1960 data, a
table giving the same information for the
1961 data, and a table showing the incidence
of unusual physical, social, and emotional
development have been deposited with the
American Documentation Institute. Order
Document No. 7073 from ADI Auxiliary
Publications Project, Photoduplication Service,
Library of Congress, Washington 25, D. C.,
remitting in advance $1.25 for microfilm
or $1.25 for photocopies. Make checks payable
to: Chief, Photoduplication Service,
Library of Congress.


In Tables 2, 3, and 4 only the significance of
differences is reported. A more detailed
presentation of the confidence intervals is
given by Ripple (1961).
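The decision rule stated above (a difference is significant when its confidence interval does not span zero) can be sketched as follows. The interval has the general form used for comparisons with a control: the difference plus or minus a critical value times the standard error of the difference. The function names and all numbers are hypothetical, and the critical value is assumed to be read from Dunnett's published tables rather than computed here.

```python
import math


def difference_interval(mean_acc, mean_other, n_acc, n_other,
                        ms_within, critical_value):
    """Interval around (accelerated mean - comparison mean).

    ms_within is the error mean square from the analysis of variance;
    critical_value is assumed to come from a table for comparisons
    with a control.
    """
    diff = mean_acc - mean_other
    half_width = critical_value * math.sqrt(
        ms_within * (1 / n_acc + 1 / n_other))
    return diff - half_width, diff + half_width


def is_significant(interval):
    """True when the interval lies wholly above or wholly below zero."""
    lower, upper = interval
    return lower > 0 or upper < 0


# Hypothetical values: 12 pupils per group, error mean square 16, d = 2.6.
wide_gap = difference_interval(50.0, 44.0, 12, 12, 16.0, 2.6)
narrow_gap = difference_interval(50.0, 48.0, 12, 12, 16.0, 2.6)
```

In this made-up case the six-point difference yields an interval wholly above zero, while the two-point difference yields one that spans zero.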


RESULTS 


Differences between the accelerated 
group and the six comparison groups on 
measures where the interaction was not 
significant are summarized in Table 2. 
The interpretations of these differences 
are based on the seven groups rather 
than on girls and boys separately. Since 
confidence intervals were built for the 
sexes separately on measures where a 
significant interaction was encountered, 
separate summary tables for girls and 
boys are presented for these measures, 
Table 3 for girls and Table 4 for boys. 

Abbreviations are used to designate 
each of the seven groups of pupils in the 
tables as follows: 

Acc—refers to the experimental group 
of pupils of superior learning abilities 
identified as older second graders in 
March 1960, and subsequently accele- 
rated to the fourth grade in September 
1960. 

3SO—refers to the control group of 
superior learning abilities (the group 
randomly assigned to nonacceleration) 
identified as older second graders in 
March 1960 and subsequently enrolled 
in the third grade in September 1960. 

3SY—refers to the control group of
superior learning abilities identified as 
younger second graders in March 1960 
and subsequently enrolled in the third 
grade in September 1960. 

4SO—refers to the older control 
group of superior learning abilities iden- 
tified on the basis of third grade (March 
1960) IQ and achievement scores and 
subsequently enrolled in the fourth 
grade in September 1960. 

4SY—refers to the younger control
group of superior learning abilities identified
on the basis of third grade (March
1960) IQ and achievement scores and
subsequently enrolled in the fourth
grade in September 1960.

4AO—refers to the older control 
group of average learning abilities iden- 
tified on the basis of third grade (March 
1960) IQ and achievement scores and 
subsequently enrolled in the fourth 
grade in September 1960. 

4AY—refers to the younger control
group of average learning abilities identified
on the basis of third grade (March
1960) IQ and achievement scores and
subsequently enrolled in the fourth
grade in September 1960.

As can be seen in Table 2, the accelerated
group (Acc) and the nonaccelerated
control group (3SO) were not significantly
different in chronological age,
Kuhlmann-Anderson IQ, Metropolitan
educational achievements, and teacher
ratings used in the selection process in
March 1960. Table 2 also presents differences
between the accelerated group
and the nonaccelerated control group
(3SO) on measures obtained in March
1961, where the interaction between sex
and group was not significant. It should
be noted that the nonaccelerated control
group was not significantly higher
than the accelerated group on any of
the measures in Table 2. On four measures
(arithmetic computation, arithmetic
problem solving and concepts, Plot
Titles-Cleverness, and Plot Questions-
Sensitivity) the accelerated group was
significantly higher than the nonaccelerated
control group.


TABLE 2

SUMMARY OF DIFFERENCES BETWEEN THE ACCELERATED GROUP AND THE SIX COMPARISON
GROUPS ON MEASURES WHERE THE INTERACTION WAS NOT SIGNIFICANT

Not significantly different from Acc group
Significantly lower than Acc group
Significantly higher than Acc group

March 1960 measures
Chronological Age
IQ (Kuhlmann-Anderson)
Teacher Ratings
Metropolitan Achievement Test-
Elementary Battery

March 1961 measures
Metropolitan Achievement Test-
Intermediate Battery:
Arithmetic Computation
Arithmetic Problem Solving
and Concepts

Attitude toward School, Learning:
Places Liked Best
Attitude Inventory

Problem Solving Test

Ethical Values Inventory:
Rational Conscientious
Irrational Conscientious
Group Conforming

Handwriting Speed

Psychomotor Abilities:
Jump and Reach
Wall Pass
Shuttle Run

Teacher Rating Scale:

3SO
3SO, 4SO, 4SY
3SO

4SY, 4AO
4SY

3SO, 3SY, 4SY

3SY, 4SO, 4SY, 4AO, 4AY
3SY, 4SO, 4SY, 4AO, 4AY

3SY, 4SO, 4SY, 4AY
3SY, 4SO, 4SY, 4AO, 4AY
3SO, 3SY, 4SO, 4SY, 4AO, 4AY
3SO, 3SY, 4SY, 4AO, 4AY

3SY, 4SO, 4SY, 4AO, 4AY
3SO, 3SY, 4SY, 4AO, 4AY
3SO, 3SY, 4SO, 4AY

3SY 4SO, 4SY, 4AO, 4AY
4AO, 4AY 3SY

3SO, 3SY, 4AY
3SY, 4AO, 4AY

4AO, 4AY

Total first 12 items
Emotional Adjustment
Social Adjustment
Physical Coordination
Creative Thinking Tests:
Object Uses-Fluency
Object Uses-Flexibility
Word Uses-Fluency
Word Uses-Flexibility
Plot Titles-Fluency
Plot Titles-Cleverness
Expressional Fluency
Plot Questions-Sensitivity
Object Improvement

3SO, 3SY, 4SO, 4SY
3SO, 3SY, 4SO, 4SY, 4AO, 4AY
3SO, 3SY, 4SO, 4SY, 4AO, 4AY
3SO, 3SY, 4SO, 4SY, 4AO, 4AY

3SO, 3SY, 4SY, 4AY

3SY, 4SY, 4AO, 4AY

3SO, 3SY, 4SY, 4AO, 4AY

3SY, 4SO, 4SY, 4AY
3SO, 3SY, 4SO, 4SY, 4AO, 4AY
3SY, 4SO

3SO, 3SY, 4SO, 4SY

4SY

3SO, 3SY, 4SO, 4SY, 4AO

3SO, 4SY, 4AO, 4AY
4AO, 4AY

3SO, 3SY, 4AY
4AY

Table 3 presents comparisons between
the accelerated girls and girls in the six
other groups on 11 measures where the
interaction between sex and group was
significant. It should be noted that the
nonaccelerated control girls (3SO) were
not significantly higher than the accelerated
girls on any of the measures in
Table 3. On four of the measures (total
battery score, reading, spelling,
and language total from the Metropolitan
Achievement Tests) the accelerated
girls were significantly higher than the
nonaccelerated girls.

Table 4 presents comparisons between
the accelerated boys and boys in the six
other groups on 11 measures where the
interaction between sex and group was
significant. The nonaccelerated boys
(3SO) were significantly higher than
the accelerated boys only in acceptance
by peers as measured with the sociometric
instrument. These two groups of
boys were not significantly different on
the other 10 measures presented in
Table 4.

The many measures on which the
accelerated pupils were not significantly
different from or higher than the
nonaccelerated control pupils can be
observed in Tables 2, 3, and 4. Also, the
comparisons of the accelerated pupils
with pupils in the other five groups can
be observed in the tables. In general,
the accelerated pupils (Acc) were
significantly higher than the younger third
graders of superior learning abilities
(3SY), significantly lower than older
fourth graders of superior learning abilities
(4SO), not significantly different
from younger fourth graders of superior
learning abilities (4SY), and significantly
higher than older and younger
fourth graders of average learning abilities
(4AO and 4AY).


TABLE 3

SUMMARY OF DIFFERENCES BETWEEN ACCELERATED GIRLS AND GIRLS IN THE SIX
COMPARISON GROUPS ON 11 MEASURES WHERE THE INTERACTION WAS SIGNIFICANT

Not significantly different from Acc girls
Significantly lower than Acc girls

March 1961 measures
Metropolitan Achievement
Test-Intermediate Battery:
Total Battery Score
Word Knowledge
Reading
Spelling
Language Total
Language Study Skills
Social Studies Study Skills
Handwriting Legibility
Psychomotor Abilities:
Basketball Throw
Sociometric Instrument
Creative Thinking Tests:
Sentence Improvement

4SY
3SO, 3SY, 4SY

4SO, 4SY

3SY, 4SO, 4SY, 4AO

3SO, 4SO, 4SY, 4AO, 4AY
4SO, 4SY, 4AY
3SO, 3SY, 4SO, 4SY, 4AY

3SO, 3SY, 4SO, 4SY, 4AO, 4AY
3SO, 3SY, 4SO, 4SY, 4AO, 4AY

3SO, 3SY, 4SO, 4SY, 4AO, 4AY

3SO, 3SY, 4AY
4AO, 4AY

3SO, 3SY, 4AO, 4AY
3SO, 4AY

3SO, 3SY, 4AO, 4AY
3SY

3SY, 4AO

4AO


TABLE 4

SUMMARY OF DIFFERENCES BETWEEN ACCELERATED BOYS AND BOYS IN THE SIX
COMPARISON GROUPS ON 11 MEASURES WHERE THE INTERACTION WAS SIGNIFICANT

Measure
Not significantly different from Acc boys
Significantly lower than Acc boys
Significantly higher than Acc boys

March 1961 measures
Metropolitan Achievement
Test-Intermediate Battery:
Total Battery Score
Word Knowledge
Reading
Spelling
Language Total
Language Study Skills
Social Studies Study Skills
Handwriting Legibility
Psychomotor Abilities:
Basketball Throw
Sociometric Instrument
Creative Thinking Tests:
Sentence Improvement

3SO, 3SY, 4SY
3SO, 3SY, 4SY

3SO, 3SY, 4SY

3SO, 3SY, 4SO, 4SY, 4AY

3SO, 3SY, 4SY

3SO, 3SY, 4SO, 4SY

3SO, 3SY, 4SO, 4SY

3SO, 3SY, 4SO, 4SY, 4AO, 4AY

3SO, 3SY, 4SY, 4AO, 4AY
3SY, 4SY, 4AO, 4AY

3SO, 3SY, 4SY, 4AO, 4AY

4AO, 4AY
4AO, 4AY
4AO, 4AY
4AO

4AO, 4AY
4AO, 4AY
4AO, 4AY


DISCUSSION


Terman (Terman & Oden, 1947),
Lehman (1953), Pressey (1955),
and many others have cited the need for
early identification of superior pupils
and have recommended that these pupils
progress more rapidly through
school and college. Since the accelerated
pupils in this experiment were among
the oldest for their grade, a greater sense
of urgency surrounds the question of
their acceleration. Assuming the usual
progression of one grade per year, the
accelerated pupils in this study will
graduate from high school at age 17-1
to 17-7 rather than 18-1 to 18-7. This
will permit them to enter college, and
eventually to start their productive careers,
one year earlier than would have been
possible without the acceleration.

The data on which the comparisons 
were based were collected in the seventh 
month of the first school year following 
the acceleration. Since the accelerates
performed as well as or better than their
older third grade controls at that time,
their acceleration was not harmful. The 
only observed negative effect was rela- 
tively lower peer acceptance for the 
accelerated boys. Subsequent research 
will determine if this decrease was only 
temporary. The accelerated and other 
pupils will be studied as they progress 
through their school careers.


SCRAMBLED VERSUS ORDERED SEQUENCE
IN AUTOINSTRUCTIONAL PROGRAMS


K. VLACHOULI ROE, H. W. CASE, AND A. ROE
University of California, Los Angeles 


A 71-item autoinstructional program on elementary probability was 
presented in scrambled and in properly ordered sequence, respectively, 
to 2 groups—each of 18 psychology students classified according to 
prior mathematical ability. Students proceeded once through the 
linear program at their own pace and were given a test immediately 
after. The sequence of items had no significant effect on (a) time 
required for learning, (b) error score during learning, (c) criterion 
test score, and (d) time required for criterion test. This suggests that 
rather than being an absolute requirement, the careful sequencing of 
items in an autoinstructional program may be a function of such variables
as length of program, information content of items, and individual
learner differences.


Teaching machine instruction has 
been characterized as having four sup- 
posedly important properties: 


1. The subject matter is broken into
small, carefully ordered items, to form a
graded series.

2. The student is allowed to respond 
explicitly to each of the items. 


3. The student is informed immedi- 
ately after his response whether the re- 
sponse is correct and thus has a means 
of immediately correcting wrong re- 
sponses. 

4. The student can proceed on an 
individual basis in accordance with his 
own rate of learning. 

The first property has been identified 
by B. F. Skinner (1958) and R. Glaser 
(1961) as the principle of gradual pro- 
gression. Glaser has indicated that a 
gradual progression is necessary to 
establish complex repertoires. 

In an experiment (Roe, Massey, 
Weltman, & Leeds, 1960) with college 
freshman engineering students, it was 
found that if a carefully ordered se- 
quence of learning items was prepared, 
students learned equally well whether 
they were required to compose their 
responses, make multiple-choice selections,
or give no overt responses at all.
There was no apparent difference in
student learning whether the sequence was
presented by a machine, a programed
textbook, or a lecturer. It seemed that
of the four points mentioned above,
only the first, namely the sequence of
items, was important. However, the importance
of the careful ordering of items
became suspect when it was discovered
that a student who had failed to read the
introductory instructions of a programed
textbook read down the page
instead of from page to page, so that the
sequence of items he saw was numbered
1, 40, 79, 118, 157, 2, 41, 80, 119, 158,
3, 42, and so on. This student still managed
to get a high score on the criterion
test.
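The order in which this student met the items can be reproduced with a short sketch. It assumes, from the step of 39 between successive numbers and the five numbers per run, that the textbook had 39 pages with five item positions on each; these figures, the resulting total of 195 items, and the function name are inferences for illustration and are not stated in the report.

```python
def down_the_page_order(n_pages, positions_per_page):
    """Item numbers met when each page is read top to bottom.

    In a programed textbook of this kind, item k + 1 sits on the page
    after item k, so the items printed on one page are n_pages apart.
    """
    return [page + position * n_pages
            for page in range(1, n_pages + 1)
            for position in range(positions_per_page)]


# Assumed layout: 39 pages, 5 item positions per page.
order = down_the_page_order(39, 5)
first_twelve = order[:12]
```

Under these assumptions the first twelve values match the sequence quoted above, and every item is still seen exactly once, only far from its intended neighbors.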

This report deals with an experiment 
designed to determine whether the pres- 
entation of a proper sequential ordering 
of related subject items does affect the 
terminal performance of a student dif- 
ferently from the presentation of a ran- 
dom ordering of the same subject items. 
Stated as a hypothesis: The mean per- 
formance in a criterion test of students 
who have studied a proper sequential 
ordering of related subject items will be 
significantly different from the mean 
performance on the same criterion test
by students who have studied a random 
ordering of the subject items. The hy- 
pothesis is based on the use of subject 
items which are related, with one item 
normally depending on a preceding 
item, and on the students’ terminal per- 
formance, rather than their perform- 
ance on intermediate items. A variable 
which could affect the hypothesis is the 
number of items used in one learning
session. It is obvious that if the program 
consisted of only three randomly or- 
dered items, many students could mem- 
orize them, mentally unscramble or re- 
order them, and perform adequately on 
a terminal test. If the program con- 
sisted of 200 items, the problem of men- 
tally storing and unscrambling them 
should be much more difficult. In this 
first experiment, the number of items 
variable was not explored. Only enough 
items to make up what would normally 
be a one-hour learning session (71 items, 
in this case) were used. 

Shortly before this experiment was 
conducted, the work of Gavurin and 
Donahue (1961) on logical sequence 
and random sequence was brought to 
the authors’ attention. Gavurin and 
Donahue used a program of 29 items, 
with randomization occurring only with- 
in each block. Subjects in the second 
group, who presumably were not 
matched for verbal ability or prior 
knowledge of the material, were re- 
quired to repeat all missed items within 
a block before proceeding to the next 
block. A criterion test was not admin- 
istered until one month after the learn- 
ing session. Gavurin and Donahue 
found that subjects who studied the 
random sequence made significantly
more errors than those who studied the
logical sequence. However, after one 
month there was no significant differ- 
ence in the amount of retention of the 
material between the two groups. 

It was anticipated that the current
experiment would shed some light on
the sequencing variable by (a) scrambling
larger blocks of items, (b) choosing
subject matter which had a minimum
of “word” responses that may not
require logical sequencing, (c) accounting
for prior ability of subjects, (d)
eliminating the repetition of missed
items, which may tend to exaggerate
the error score, and (e) eliminating the
leveling effect of time on long-term
retention by administering the criterion
test immediately after the learning
session.


METHOD 


A group of 36 freshman psychology stu- 
dents were classified into upper, middle, and 
lower thirds according to their prior mathe- 
matical ability as indicated by quantitative 
scores on the College Entrance Examination 
Board (CEEB) examinations. Within each 
third, students were randomly assigned to
one of the two sequencing methods. No pretest was
administered because previous studies with 
the program used indicate that lower division 
college students have little prior knowledge 
of the subject matter, and that the presence
of such prior knowledge does not correlate 
with terminal performance after studying the 
programed items. 
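The assignment procedure described above (classification into thirds by aptitude score, then random assignment within each third) can be sketched as follows. The function name, the student labels, the scores, and the seed are hypothetical, and splitting each third evenly between the two conditions is an assumption consistent with the two groups of 18 reported.

```python
import random


def stratified_assignment(scores, n_strata=3,
                          conditions=("scrambled", "ordered"), seed=0):
    """Rank students by score, cut into strata, randomize within each.

    Returns a dict: student -> (stratum index, condition), where
    stratum 0 is the highest-scoring group.
    """
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get, reverse=True)
    size = len(ranked) // n_strata
    assignment = {}
    for s in range(n_strata):
        # The last stratum takes any remainder.
        stop = len(ranked) if s == n_strata - 1 else (s + 1) * size
        stratum = ranked[s * size:stop]
        rng.shuffle(stratum)
        for i, student in enumerate(stratum):
            assignment[student] = (s, conditions[i % len(conditions)])
    return assignment


# Hypothetical quantitative scores for 36 students.
scores = {f"S{i:02d}": 400 + 10 * i for i in range(36)}
assignment = stratified_assignment(scores)
```

With 36 students this gives six students per sequence within each third, so the two sequence groups are balanced on prior aptitude by construction.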

The learning items consisted of 71 frames 
taken from the program on elementary prob- 
ability developed over the past 2 years by 
members of the Teaching Systems Research 
Project at the Department of Engineering, 
University of California, Los Angeles. The 
concepts covered include relationship be- 
tween information and degree of certainty, 
deterministic vs. probabilistic problems, prob- 
ability ratio, additive law of probability, 
multiplicative law of probability, sampling 
with and without replacement, and mutually 
exclusive and independent events. 

Each frame, printed on a 4- by 6-inch 
card, consisted of a learning item followed 
by a statement requiring multiple-choice 
completion. The student had to remove a 
patch of opaque rubber cement (to which 
powdered graphite had been added) below 
one of the possible answers, thereby record- 
ing his choice as well as uncovering the cor- 
rect response. 

Students were gathered in a classroom and 
given instruction sheets and boxes containing 
the learning items. Half the students re- 
ceived boxes with an ordered sequence of 
items, and the other half received boxes with 
a scrambled sequence of items. Immediately
on completing the program, each student was
given a criterion test. Times for completing 
the learning session and the criterion test 
were recorded. The boxes of cards were sub- 
sequently examined to find the number of 
errors made during the learning session. 


RESULTS 


Chi square tests of normality of dis- 
tributions were made for all variables 
and Bartlett’s test for homogeneity of 
variances was made for all pertinent 
variances. In no case were the assump- 
tions of normality or homogeneity of 
variances rejected, though it was recog- 
nized that for the sample sizes involved 
these tests were not powerful. Also, t
tests were made to ascertain that the 
three (mathematical aptitude) strata 
were significantly different from one 
another and that, within each stratum, 
the groups were not significantly differ- 
ent from one another. 

Analyses of variance were then per- 
formed on the following data: time re- 
quired for learning, error score during 
learning, time required for criterion test, 
and criterion test scores. The two vari- 
ables in each of these analyses of vari- 
ance were the method of sequencing 
items and the students’ mathematical 
aptitude. 

Since programed instruction is designed
to bring most students up to a
high level of performance on criterion
items, it was expected that the distribution
of criterion test scores would be
skewed towards the high end (negatively
skewed). Though this skewing
was not large enough to negate the
normality assumptions (for the limited
size of the sample), nonparametric tests
were also used on the criterion test
scores. In all cases, the nonparametric
test results were consistent with the
analyses of variance results.

In the analysis of variance, it was
found that the sequence of items had
no significant effect on the dependent
variables, nor was the interaction between
sequencing and aptitude significant
for any of them. However, prior
mathematical aptitude did have a
significant effect on error score and test
score. With this in mind an analysis of
covariance was made, where the individual
CEEB examination score on
mathematical ability was used as a
covariate with each student’s experimental
performance score. The results
are summarized in Table 2.


TABLE 1

MEANS AND STANDARD DEVIATIONS

                          Time required     Error score        Time required     Criterion test
                          for learning      during learning    for criterion     scores (possible
                          (minutes)         (possible          test (minutes)    maximum = 10)
                                            maximum 71)
                          M       SD        M       SD         M       SD        M       SD
Sequence:
  Scrambled               45.0    7.9       10.2    4.7        15.6    3.5       7.3     2.4
  Ordered                 46.5    8.7        9.3    4.6        15.9    5.5       6.3     2.3
Mathematical aptitude:
  Upper third             45.75   7.7        7.3    3.9        14.7    3.0       8.0     2.0
  Middle third            44.5    9.8        9.5    3.1        16.1    4.7       7.5     2.1
  Lower third             47.3    7.5       12.4    4.2        16.4    5.7       5.3     2.3


TABLE 2

ANALYSIS OF COVARIANCE

Measure                        Method       Adjusted M   Adjusted SD   F ratio       Significance
Learning time (minutes)        Scrambled    45.1         7.9
                               Ordered      46.6         8.7           0.157         ns
Error score (out of 71)        Scrambled    10.2         3.8
                               Ordered       9.3         4.6           1.222         ns
Test time (minutes)            Scrambled    15.6         3.4
                               Ordered      15.9         5.5           0.003         ns
Test score (maximum = 10.0)    Scrambled     7.3         2.4
                               Ordered       6.3         2.2           (illegible)   ns

Note. N = 36; df = 1/33. Covariate means: scrambled, 573.8; ordered, 552.9.


DISCUSSION


Considerable importance had been
attached to the careful sequencing of
autoinstructional items by most program
writers; and it was thought that the
student mentioned in the introduction,
who, in an earlier experiment, had
followed an improper sequence and still
scored high on the criterion test, must
have been unusually gifted. The results
of this small-scale experiment, however,
seem to indicate that college level
students may not require the careful
sequencing of autoinstructional items that
had previously been supposed necessary.
A clue to the possible reason comes from a
conversation with one of the students
immediately after the conclusion of the
experiment.

Student: What kind of program was that?

Experimenter: Why do you ask?

Student: I was finding all kinds of things
I didn’t know and couldn’t answer and was
getting curious about what was going on.

Experimenter: What did you do?

Student: Well, I kept looking for the
information for the items I couldn’t answer,
and when I found them later on I felt very
glad.

Possibly, a closely ordered sequence
of items, in which the information for
completing a given item has been supplied
in immediately preceding items,
can be followed easily and without
much more than short-time recall attention.
Presenting items out of sequence
possibly introduced a task-oriented
anxiety which was subsequently relieved
in a moment of revelation when a missing
clue was discovered. That the students
using the scrambled sequence of
items were possibly more alert is indicated
by the fact that the five comments
concerning typographical errors in the
program were made by students in this
group.

It is difficult to generalize from an 
experiment using a limited number of 
students studying a narrowly defined 
subject area in an ad hoc situation. It 
would be interesting to examine the 
scrambled versus ordered sequence 
question with students of various age 
levels, with programs of different length 
and in different subject areas, and with 
retention tests made at spaced intervals 
of time.